ShubhamHAgrawal

Reputation: 1014

db.collection.find() taking too much time to return complete data

I am doing the following to pull the complete data from a MongoDB collection.

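from pymongo import MongoClient

def fetch_all():  # wrapper name is illustrative; the original snippet was a function body
    db_client = MongoClient(host='host')
    db_database = db_client['db_name']
    raw_data = db_database.collection_name.find()
    result_data = [row for row in raw_data]
    return result_data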

It is taking too much time to return. What is the best way to fetch complete data from the collection?

Upvotes: 1

Views: 231

Answers (2)

Johnny Metz

Reputation: 5965

Skip the list comprehension step completely (which is probably the reason for the lag) by converting the cursor to a list right off the bat:

raw_data = list(db_database.collection_name.find())

Upvotes: 1

kevinadi

Reputation: 13775

If you have a lot of documents, this line:

result_data = [row for row in raw_data]

is where Python spends most of its time.

Depending on what you want to do with the documents, you may be able to do:

for row in raw_data:
    # process each row as it arrives instead of storing it
    print(row)
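
If per-row handling is too fine-grained, the same streaming idea can be applied in chunks. The following is only a sketch: `iter_chunks` and `count_documents_streaming` are hypothetical helper names, and the chunk size of 1000 is an arbitrary assumption. It accepts any iterable, so a PyMongo cursor such as `db_database.collection_name.find()` should work as input.

```python
from itertools import islice


def iter_chunks(cursor, chunk_size=1000):
    """Yield lists of at most chunk_size documents from any iterable cursor.

    Only one chunk is held in memory at a time, so the whole collection
    never has to be materialized as a single Python list.
    """
    iterator = iter(cursor)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_documents_streaming(cursor, chunk_size=1000):
    """Example consumer: count documents while keeping memory use bounded."""
    total = 0
    for chunk in iter_chunks(cursor, chunk_size):
        total += len(chunk)
    return total
```

For example, `for chunk in iter_chunks(raw_data): ...` would let each chunk be processed and discarded before the next one is read. This does not make the collection scan itself any faster; it only avoids building one large list.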

However, if you intend to return the whole collection without processing it, you are doing a collection scan (equivalent to a table scan in SQL) and creating a large Python data structure. Neither operation is fast on its own. Combined, they're going to be very slow, and there's no workaround that I'm aware of.

If your intent is to dump the whole collection, you may want to look at mongodump or mongoexport instead, which are designed to perform this task.
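
If the dump needs to be triggered from Python, one option is to shell out to mongoexport. This is a rough sketch, assuming the `mongoexport` binary is installed and on the PATH; the helper names and the output path are hypothetical, and the host, database, and collection values are the placeholders from the question.

```python
import subprocess


def build_mongoexport_command(host, db_name, collection_name, out_path):
    """Build the argument list for a mongoexport call (nothing is executed here)."""
    return [
        "mongoexport",
        "--host", host,
        "--db", db_name,
        "--collection", collection_name,
        "--out", out_path,
    ]


def export_collection(host, db_name, collection_name, out_path):
    """Run mongoexport; raises CalledProcessError if the tool exits non-zero."""
    command = build_mongoexport_command(host, db_name, collection_name, out_path)
    subprocess.run(command, check=True)
```

Calling `export_collection('host', 'db_name', 'collection_name', 'dump.json')` would write the documents to `dump.json` without loading them into the Python process.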

Upvotes: 1
