As mentioned in the first part of this series, Python Database Programming with MongoDB, the Python module PyMongo is required for Python to communicate with a MongoDB database. To install it, run the following command at the Windows Command Prompt:
pip3 install pymongo
Installing PyMongo should produce an output similar to what is shown below:
Figure 1 – Installing the PyMongo Module
Depending on the Python configuration, an additional module named dnspython may also be necessary:
pip3 install dnspython
Figure 2 – Installing the dnspython module
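Before moving on, it can be worth confirming that both modules are visible to the interpreter that will run the scripts. The following is a minimal sketch using only the standard library. The helper name module_available is illustrative and not part of PyMongo, and dns is the import name that the dnspython package installs under.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "dns" is the import name provided by the dnspython package.
    for name in ("pymongo", "dns"):
        status = "installed" if module_available(name) else "missing"
        print(name + ": " + status)
```

If either module is reported as missing, the most likely cause is that pip3 installed it for a different Python installation than the one running the script.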
How to Insert Data in MongoDB with Python
The code below will create 15 randomly generated Artists and two Albums for each of them:
# bad-band-name-maker-nosql.py

import sys
import random
import pymongo

part1 = ["The", "Uncooked", "Appealing", "Larger than Life", "Drooping", "Unwell", "Atrocious", "Glossy", "Barrage", "Unlawful"]
part2 = ["Defeated", "Hi-Fi", "Extraterrestrial", "Adumbration", "Limpid", "Looptid", "Cromulent", "Unsettled", "Soot", "Twinkle"]
part3 = ["Brain", "Segment", "\"Audio\"", "Legitimate Business", "\"Bob\"", "Sound", "Canticle", "Monsoon", "Preserves", "\"Cacophony\""]
part4 = ["Cougar", "Lion", "Lynx", "Ocelot", "Puma", "Jaguar", "Panther"]
part5 = ["Fodder", "Ersatz Goods", "Leftovers", "Infant Formula", "Mush", "Smoothie", "Milkshakes"]

def main(argv):
    # Connect to the RazorDemo database.
    # Replace the connection string with the one for your own cluster.
    client = pymongo.MongoClient("mongodb+srv://yourUser:[email protected]/RazorDemo?retryWrites=true&w=majority", serverSelectionTimeoutMS=5000)
    artistsCollection = client["RazorDemo"]["Artists"]
    albumsCollection = client["RazorDemo"]["Albums"]

    # Generate 15 bad band names, and try to keep them unique.
    previousNames = ""
    nameCount = 0
    artistJson = []
    while (nameCount < 15):
        # randrange() excludes its upper bound, so use the list length
        # to make every entry selectable.
        rand1 = random.randrange(0, len(part1))
        rand2 = random.randrange(0, len(part2))
        rand3 = random.randrange(0, len(part3))
        badName = part1[rand1] + ' ' + part2[rand2] + ' ' + part3[rand3]

        # MongoDB allows for the insertion of multiple documents in a single call.
        # In this case, the code will build a list of all the band names
        # to be inserted in one fell swoop.
        if ("|" + previousNames + "|").find("|" + badName + "|") == -1:
            #print ("Band name [" + str(nameCount) + "] is [" + badName + "]")
            # Don't forget to escape quotation marks!
            jsonEntry = { "artist_name" : badName }
            artistJson.append(jsonEntry)

            # Because there are no foreign key rules, the album names can be created
            # and committed to the database before the artist names have been created.
            albumJson = []
            for y in range(1, 3):
                rand4 = random.randrange(0, len(part4))
                rand5 = random.randrange(0, len(part5))
                # No checks for uniqueness here. Peter Gabriel had 4 self-titled
                # albums after all.
                albumName = part4[rand4] + " " + part5[rand5]
                albumEntry = { "artist_name" : badName, "album_name" : albumName }
                albumJson.append(albumEntry)
            print (albumJson)
            albumsCollection.insert_many(albumJson)

            # Creates a bar-delimited list of previously used names.
            # MongoDB expects the application to enforce data integrity rules.
            if previousNames == "":
                previousNames = badName
            else:
                previousNames = previousNames + "|" + badName
            nameCount = 1 + nameCount
        else:
            print ("Found a duplicate of [" + badName + "]")

    print (artistJson)
    artistsCollection.insert_many(artistJson)

    # Close the Connection
    client.close()
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])

Listing 6 - Creating Random Data
One interesting observation about this code, at least compared to the SQL-oriented examples in Python Database Programming with SQL Express for Beginners, is that it is much simpler, as there is no additional SQL component. The documents are ordinary Python dictionaries and lists, so no separate JSON handling is needed, and the only MongoDB-related command is insert_many(), which is called after each dataset is built. Even more convenient, these commands closely mirror the syntax used in the MongoDB Shell, where the equivalent command is insertMany().
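Because the documents are plain Python data, the generation logic can be separated from the database calls and exercised on its own. The sketch below is one possible variation, not the listing's method: it swaps the bar-delimited string for a set, uses random.choice(), and works from shortened word lists. The names build_documents, PART1 through PART5 and the rng parameter are all hypothetical.

```python
import random

# Shortened word lists, for illustration only.
PART1 = ["The", "Uncooked", "Appealing", "Drooping", "Unwell"]
PART2 = ["Defeated", "Hi-Fi", "Limpid", "Cromulent", "Unsettled"]
PART3 = ["Brain", "Segment", "Sound", "Canticle", "Monsoon"]
PART4 = ["Cougar", "Lion", "Lynx", "Ocelot"]
PART5 = ["Fodder", "Leftovers", "Mush", "Smoothie"]


def build_documents(artist_count=15, albums_per_artist=2, rng=random):
    """Build artist and album documents without touching the database."""
    used_names = set()  # stands in for the bar-delimited string of used names
    artists, albums = [], []
    while len(artists) < artist_count:
        name = " ".join(rng.choice(part) for part in (PART1, PART2, PART3))
        if name in used_names:
            continue  # duplicate band name, so try again
        used_names.add(name)
        artists.append({"artist_name": name})
        for _ in range(albums_per_artist):
            album_name = rng.choice(PART4) + " " + rng.choice(PART5)
            albums.append({"artist_name": name, "album_name": album_name})
    return artists, albums


if __name__ == "__main__":
    artists, albums = build_documents()
    print(len(artists), "artists and", len(albums), "albums generated")
    # With a live connection, each list could then be written in one call:
    #   artistsCollection.insert_many(artists)
    #   albumsCollection.insert_many(albums)
```

Unlike the listing, this variation would write all albums in a single insert_many() call instead of one call per artist.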
From a security standpoint, classic SQL Injection does not apply to such code: there is no SQL being executed, and the values are handed to the driver as data rather than being spliced into a command string. The driver also takes care of serializing the Python dictionaries and lists, so problems like escaping quotation marks inside values do not arise. Input from users should still be validated before it is placed into a query document, since MongoDB query operators can themselves be misused.
Rather than showing the output in the Command Prompt window, another piece of code will be used to query the database.
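To illustrate how values travel as data, the sketch below builds the same kind of equality filter that the display script later in this article uses. The function name build_album_query is hypothetical. Coercing the value with str() is an assumption about how one might guard against a dictionary of query operators arriving where a plain name was expected.

```python
def build_album_query(artist_name):
    """Return a filter matching albums whose artist_name equals the value."""
    # str() turns a stray dict such as {"$ne": None} into plain text, so it
    # is compared as a string instead of being read as a query operator.
    return {"artist_name": {"$eq": str(artist_name)}}


if __name__ == "__main__":
    # Quotation marks inside the value need no special handling here.
    print(build_album_query('The Hi-Fi "Audio"'))
    print(build_album_query({"$ne": None}))
```

The resulting dictionary can be passed directly to find() on a collection.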
Validating the Inserts with Python
The code below will query the MongoDB database for the insert actions made above using Python:
# bad-band-name-display-nosql.py

import sys
import pymongo

def main(argv):
    # Connect to the RazorDemo database.
    client = pymongo.MongoClient("mongodb+srv://yourUser:[email protected]/RazorDemo?retryWrites=true&w=majority", serverSelectionTimeoutMS=5000)
    artistsCollection = client["RazorDemo"]["Artists"]
    albumsCollection = client["RazorDemo"]["Albums"]

    print ("Albums:")
    artists = artistsCollection.find()
    for artist in artists:
        print (str(artist["artist_name"]))
        # Look up the albums that belong to this artist.
        albumQuery = { "artist_name": {"$eq" : str(artist["artist_name"])} }
        albumsForThisArtist = albumsCollection.find(albumQuery)
        for album in albumsForThisArtist:
            print ("\t" + str(album["album_name"]))

    # Close the Connection
    client.close()
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])

Listing 7 – Validating the Insert Actions
The output below contains the initial documents created further up in the document:
Figure 3 – Validating the Inserts
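Listing 7 issues one album query per artist. For a small data set that is fine, but as an alternative the albums could be fetched once and grouped in Python. The sketch below shows only the grouping and formatting steps, using plain lists of dictionaries in place of the cursors that find() returns. The function names group_albums_by_artist and format_listing are hypothetical.

```python
from collections import defaultdict


def group_albums_by_artist(album_docs):
    """Map each artist_name to the list of its album_name values."""
    grouped = defaultdict(list)
    for album in album_docs:
        grouped[album["artist_name"]].append(album["album_name"])
    return dict(grouped)


def format_listing(artist_docs, albums_by_artist):
    """Produce the same artist and indented-album layout as Listing 7."""
    lines = []
    for artist in artist_docs:
        name = artist["artist_name"]
        lines.append(name)
        for album_name in albums_by_artist.get(name, []):
            lines.append("\t" + album_name)
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-ins for artistsCollection.find() and albumsCollection.find().
    artists = [{"artist_name": "The Limpid Sound"}]
    albums = [
        {"artist_name": "The Limpid Sound", "album_name": "Lynx Mush"},
        {"artist_name": "The Limpid Sound", "album_name": "Puma Fodder"},
    ]
    print(format_listing(artists, group_albums_by_artist(albums)))
```

Both functions accept any iterable of documents, so the same code should work when given the cursors directly.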
Querying MongoDB Data with Python
The code above can be adapted into an interactive tool to query the data with user input. MongoDB provides a powerful text search tool for its collections, but in order to enable it, text indexes must be created on the collections to be searched:
db.Artists.createIndex({artist_name: "text"})
db.Albums.createIndex({artist_name: "text", album_name: "text"})

Listing 8 – Creating Text Indices for each collection
Note that MongoDB only allows one text index per collection, although that single index can cover several fields, as the Albums index does. Attempting to create a second text index on a different field in the same collection will cause an error. The output of these commands in MongoDB Shell is below:
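The same indexes can also be created from Python instead of the MongoDB Shell. PyMongo's create_index() accepts a list of (field, index type) pairs, and the string "text" selects a text index. The sketch below only builds that list so it can be run without a connection. The helper name text_index_spec is hypothetical, and the commented calls assume the collection variables from the listings.

```python
def text_index_spec(field_names):
    """Build the key list for one text index covering all given fields."""
    # "text" is the index type string; PyMongo also exposes it as pymongo.TEXT.
    return [(name, "text") for name in field_names]


if __name__ == "__main__":
    print(text_index_spec(["artist_name", "album_name"]))
    # With a live connection, the indexes from Listing 8 would be created as:
    #   artistsCollection.create_index(text_index_spec(["artist_name"]))
    #   albumsCollection.create_index(
    #       text_index_spec(["artist_name", "album_name"]))
```

Putting every searchable field into one specification is what keeps each collection within the limit of a single text index.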
Figure 4 – Adding text indices
While the text search tool can perform more sophisticated matching, such as exact phrase searches, excluding terms and ranking results by relevance, the example below will stick with simple matching in order to illustrate the proof of concept. Regular expression and partial-word matching are handled by the separate $regex operator rather than by text search.
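For readers who want to go one step past simple matching, the sketch below composes a $search string that combines plain terms, a quoted phrase and excluded terms, along with the projection used to request a relevance score. The helper names build_search_string and build_text_query and the constant SCORE_PROJECTION are hypothetical. The quoting and leading-hyphen conventions are those of MongoDB's $text operator.

```python
def build_search_string(terms=(), phrase=None, exclude=()):
    """Compose a $search string from terms, one phrase and excluded terms."""
    parts = list(terms)
    if phrase:
        parts.append('"' + phrase + '"')  # quoted text is matched as a phrase
    parts.extend("-" + term for term in exclude)  # leading hyphen excludes
    return " ".join(parts)


def build_text_query(search_string):
    """Wrap a search string in the $text filter used by find()."""
    return {"$text": {"$search": search_string}}


# Asks MongoDB to include the relevance score in each returned document.
SCORE_PROJECTION = {"score": {"$meta": "textScore"}}


if __name__ == "__main__":
    search = build_search_string(["lynx"], phrase="Legitimate Business",
                                 exclude=["mush"])
    print(build_text_query(search))
    # With a live connection, results could be ranked by relevance:
    #   albumsCollection.find(build_text_query(search), SCORE_PROJECTION) \
    #       .sort([("score", {"$meta": "textScore"})])
```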
# bad-band-name-query-nosql.py

import sys
import pymongo

def main(argv):
    searchValue = input("Enter something: ")
    # Cap the length at something reasonable. The first 20 characters.
    searchValue = searchValue[0:20]
    # Set the search value to lower case so we can perform case-insensitive matching:
    searchValue = searchValue.lower()

    # Connect to the RazorDemo database.
    client = pymongo.MongoClient("mongodb+srv://yourUser:[email protected]/RazorDemo?retryWrites=true&w=majority", serverSelectionTimeoutMS=5000)
    artistsCollection = client["RazorDemo"]["Artists"]
    albumsCollection = client["RazorDemo"]["Albums"]

    matchedArtists = ""
    artists = artistsCollection.find( { "$text": { "$search": searchValue} })
    for artist in artists:
        matchedArtists = matchedArtists + "\t" + str(artist["artist_name"]) + "\r\n"
    if "" == matchedArtists:
        print ("No matched artists.")
    else:
        print ("Matched Artists:")
        print (matchedArtists)

    albums = albumsCollection.find( { "$text": { "$search": searchValue} })
    matchedAlbums = ""
    for album in albums:
        matchedAlbums = matchedAlbums + "\t" + str(album["artist_name"]) + " - " + str(album["album_name"]) + "\r\n"
    if "" == matchedAlbums:
        print ("No matched albums.")
    else:
        print ("Matched Albums:")
        print (matchedAlbums)

    # Close the Connection
    client.close()
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])

Listing 9 – Querying the data
Note that no conversion of the data coming out of MongoDB was needed to match it to the lowercase version of the search term, because $text searches are case-insensitive by default.
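The input handling and result formatting in Listing 9 can be pulled out into small functions that are easy to check without a database. The sketch below is one possible arrangement. The names normalize_search_value and format_matches are hypothetical, and plain dictionaries stand in for the documents a cursor would yield.

```python
def normalize_search_value(raw, max_length=20):
    """Cap the input length and lower-case it, as Listing 9 does."""
    return raw[:max_length].lower()


def format_matches(docs, fields, empty_message):
    """Return one tab-indented line per document, or a fallback message."""
    lines = ["\t" + " - ".join(str(doc[field]) for field in fields)
             for doc in docs]
    return "\n".join(lines) if lines else empty_message


if __name__ == "__main__":
    print(normalize_search_value("LYNX"))
    sample = [{"artist_name": "The Limpid Sound", "album_name": "Lynx Mush"}]
    print(format_matches(sample, ["artist_name", "album_name"],
                         "No matched albums."))
    print(format_matches([], ["artist_name"], "No matched artists."))
```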
Final Thoughts on Python and MongoDB Development
For developers who have been coding against SQL-oriented database servers and databases, the leap to NoSQL can feel like scaling a very steep learning curve, but mapping familiar SQL database concepts to their NoSQL counterparts makes the climb a little less uncomfortable. Such developers may even be shocked at the lack of “basic” features such as foreign key enforcement, or at the expectation that the application, and not the database, enforces data integrity rules. For very seasoned SQL-oriented database developers, even the mere thought of such ideas almost feels like programming heresy!
But NoSQL databases like MongoDB add many other features that make the change in thinking worth it. There is no need to worry about yet another version of SQL that is “just different enough” to be annoying, or to think about issues like SQL injection. Multiple records, err, documents of data can be inserted securely without the hassle of “thousands” of individual statements. And perhaps it is even worth entertaining the “crazy” idea that having the application do the data enforcement shaves off a huge chunk of application development effort. Taken together, these points make MongoDB worth considering.