user1700890

Reputation: 7742

pymongo - executing parallel queries

Here is pseudocode that I would like to parallelize, but I don't know where to start:

from pymongo import MongoClient


client = MongoClient('localhost', 27017)
db = client['myDB']
collection = db.myCollection

test_list = ['foo', 'bar']
result_list = []

# One distinct() query per version, run sequentially.
for el in test_list:
    result_list.append(collection.distinct('attrib', {'version': el}))

I know how to create a parallel loop with joblib, but I am not sure how to query MongoDB in parallel. Should I create multiple clients or collections? Will the above code work if I simply rewrite it with joblib, without doing anything special for MongoDB?

Upvotes: 3

Views: 2439

Answers (1)

wowkin2

Reputation: 6355

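You can run the queries in separate threads. MongoClient is thread-safe and keeps its own connection pool, so a single client can be shared by all threads; there is no need to create a client or collection per query: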

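from multiprocessing.dummy import Pool as ThreadPool

from pymongo import MongoClient


# A single client and collection, shared by all worker threads.
client = MongoClient('localhost', 27017)
db = client['myDB']
collection = db.myCollection

thread_pool_size = 4
pool = ThreadPool(thread_pool_size)

def my_function(el):
    # Distinct 'attrib' values among documents with the given version.
    return collection.distinct('attrib', {'version': el})

test_list = ['foo', 'bar']
# map() blocks until every query has finished and keeps the input order.
result_list = pool.map(my_function, test_list)

# Release the worker threads once the results are in.
pool.close()
pool.join()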
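
The same idea can also be written with the standard library's concurrent.futures, which shuts the pool down automatically. This is only a sketch: the helper name distinct_per_version and its field and max_workers parameters are made up for illustration, and it assumes the collection object can be shared across threads, as a PyMongo collection can.

```python
from concurrent.futures import ThreadPoolExecutor


def distinct_per_version(collection, versions, field="attrib", max_workers=4):
    """Run one distinct() query per version in a thread pool.

    `collection` is assumed to be a PyMongo collection (or any object with
    a compatible distinct(key, filter) method) that is safe to share
    between threads. The helper name and parameters are hypothetical.
    """
    def query(version):
        return collection.distinct(field, {"version": version})

    # The with-block waits for all queries and then shuts the pool down.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as `versions`.
        return list(executor.map(query, versions))


# Usage with the collection from the question (needs a running MongoDB):
#   result_list = distinct_per_version(collection, ['foo', 'bar'])
```

Threads are a reasonable fit here because each query spends most of its time waiting on the server. If you use processes instead, create the MongoClient inside each worker process, since a client should not be shared across a fork.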

Upvotes: 0
