Creating an index on a MongoDB collection using Python

Overview:

  • Database indexes aid in faster retrieval of data.
  • A database index is like the index pages of a book where the index points to actual page number of a specific term or a topic in the book. Without the index a reader will find it most difficult to locate a term inside the book. He or she has to scan through all the pages or make a knowledge based guess to search all the pages of one or more chapters.
  • MongoDB indexes can be built on a single field, multiple fields, text fields, latitude and longitude fields, array entries and on the hash values of a field.
  • MongoDB also supports partial indexes, unique indexes, sparse indexes and time bound indexes.
  • If there is an index associated with a MongoDB collection, a MongoDB query will use the index if it deems to optimize the execution time for a query. For example, certain composite indexes may not get used and the MongoDB may choose to use a full scan instead.

 

Example:

from pymongo import MongoClient

import pymongo

 

# Create a connection to the MongoDB server

con = MongoClient('mongodb://localhost:27017/');

 

# Database object

db      = con.sample;

 

# Collection object

coll    = db.FictionBooks;

 

#coll.drop_index("FictionYearIndex");

 

 

# Delete existing documents from the MongoDB collection

coll.delete_many({});

 

# Insert documents into the MongoDB collection

coll.insert({ "title": "Alice's Adventures in Wonderland", "author": "Lewis Carroll", "year": 1865 });

coll.insert({ "title": "Through the Looking-Glass", "author": "Lewis Carroll", "year": 1871 });

coll.insert({ "title": "Guide to the Lakes", "author": "William Wordsworth", "year":  1810 });

coll.insert({ "title": "Gone with the Wind", "author": "Margaret Mitchell", "year": 1936 });

coll.insert({ "title": "Atlas Shrugged", "author": "Ayn Rand", "year": 1957 });

coll.insert({ "title": "A Wild Sheep Chase", "author": "Haruki Murakami", "year": 1982 });

 

# Print the query plan

queryStatistics = coll.find({"year":1982}).explain()["executionStats"];

print("Query execution statistics before index creation:");

print(queryStatistics);

 

# Create an index on the year field

coll.create_index([('year', pymongo.ASCENDING )], name='FictionYearIndex', default_language='english');

 

# Print the query plan

print();

print("Query execution statistics after index creation:");

queryStatistics = coll.find({"year":1982}).explain()["executionStats"];

print(queryStatistics);

 

Output:

The series:

0    1

1    1

2    1

3    1

4    1

5    1

dtype: int64

is monotonic.

Query execution statistics before index creation:

{'executionSuccess': True, 'nReturned': 1, 'executionTimeMillis': 0, 'totalKeysExamined': 0, 'totalDocsExamined': 6, 'executionStages': {'stage': 'COLLSCAN', 'filter': {'year': {'$eq': 1982}}, 'nReturned': 1, 'executionTimeMillisEstimate': 0, 'works': 8, 'advanced': 1, 'needTime': 6, 'needYield': 0, 'saveState': 0, 'restoreState': 0, 'isEOF': 1, 'invalidates': 0, 'direction': 'forward', 'docsExamined': 6}, 'allPlansExecution': []}

 

Query execution statistics after index creation:

{'executionSuccess': True, 'nReturned': 1, 'executionTimeMillis': 0, 'totalKeysExamined': 1, 'totalDocsExamined': 1, 'executionStages': {'stage': 'FETCH', 'nReturned': 1, 'executionTimeMillisEstimate': 0, 'works': 2, 'advanced': 1, 'needTime': 0, 'needYield': 0, 'saveState': 0, 'restoreState': 0, 'isEOF': 1, 'invalidates': 0, 'docsExamined': 1, 'alreadyHasObj': 0, 'inputStage': {'stage': 'IXSCAN', 'nReturned': 1, 'executionTimeMillisEstimate': 0, 'works': 2, 'advanced': 1, 'needTime': 0, 'needYield': 0, 'saveState': 0, 'restoreState': 0, 'isEOF': 1, 'invalidates': 0, 'keyPattern': {'year': 1}, 'indexName': 'FictionYearIndex', 'isMultiKey': False, 'multiKeyPaths': {'year': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'year': ['[1982, 1982]']}, 'keysExamined': 1, 'seeks': 1, 'dupsTested': 0, 'dupsDropped': 0, 'seenInvalidated': 0}}, 'allPlansExecution': []}


Copyright 2024 © pythontic.com