Kate Bradley

Reputation: 43

How to copy select documents from one collection to another with pymongo?

I've been streaming data from Twitter into a MongoDB database. However, I found that I had formatted the search incorrectly, so I got data from all over the place instead of the one city I wanted (I determine location by checking whether the city name appears in 'location' or 'name' under 'user' in the JSON).

I want to copy just the correct documents to a new collection, but I've found it nearly impossible to do in pymongo! I'm using pymongo instead of the shell because I'm using regular expressions to search for the city names (there are a lot of synonyms for it).

regex = re.compile(<really long regular expression of city names>)

I've been able to use find() correctly with the regular expressions; it returns just what I'm looking for:

db.coll.find({'$or':[{'user.location':{'$in':[regex]}},{'user.name':{'$in':[regex]}}]})
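
For illustration, here is a minimal sketch of how a query like this could be assembled. The pattern below is a hypothetical stand-in for the real (much longer) city regex, and `build_city_query` is an illustrative helper name, not part of pymongo. PyMongo accepts compiled `re` patterns as query values, so the resulting dictionary can be passed straight to find().

```python
import re

# Hypothetical stand-in for the long city-name pattern; the real one is not shown.
CITY_REGEX = re.compile(r"new york|nyc|big apple", re.IGNORECASE)


def build_city_query(pattern):
    """Match documents whose user.location OR user.name matches the pattern."""
    return {
        "$or": [
            {"user.location": {"$in": [pattern]}},
            {"user.name": {"$in": [pattern]}},
        ]
    }


query = build_city_query(CITY_REGEX)
# With a live connection this would be used as: db.coll.find(query)
```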

I just need to copy what it returns into a new collection, but it's proving difficult.

I tried using forEach() to copy the documents, wrapping the JavaScript function in bson.Code (an approach I found here), but it still won't work.

db.coll.find({'$or':[{'user.location':{'$in':[regex]}},{'user.name':{'$in':[regex]}}]})\
.forEach(bson.Code('''
function(doc) {
   db.subset.insert(doc);
}'''))

Specifically, the error I get when I try this is

AttributeError: 'Cursor' object has no attribute 'forEach'

I have no idea what is wrong or how I can go about fixing this. Anyone able to tell me what I can do to fix this, or a better way to copy documents to a new collection?

Upvotes: 0

Views: 2508

Answers (1)

ThrowsException

Reputation: 2636

A pymongo cursor can already be iterated over directly, so you don't need forEach. Try

for tweet in db.coll.find({'$or':[{'user.location':{'$in':[regex]}},{'user.name':{'$in':[regex]}}]}):
    db.subset.insert(tweet)
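
On a large collection, one insert per document means one round trip per document, which can be slow. Note also that Collection.insert() was deprecated in PyMongo 3 and removed in PyMongo 4 in favor of insert_one() and insert_many(). Below is a minimal sketch of a batched copy, assuming `source` and `target` are pymongo Collection objects; `copy_matching` and `batch_size` are illustrative names, not part of pymongo.

```python
def copy_matching(source, target, query, batch_size=1000):
    """Copy documents matching `query` from `source` to `target` in batches.

    Returns the number of documents copied.
    """
    batch = []
    copied = 0
    for doc in source.find(query):
        batch.append(doc)
        if len(batch) >= batch_size:
            # Send a full batch in a single call instead of one call per document.
            target.insert_many(batch)
            copied += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        target.insert_many(batch)
        copied += len(batch)
    return copied
```

With the query from the question this would be called as copy_matching(db.coll, db.subset, query). Each copied document keeps its original _id, so running the copy a second time against the same target collection fails with a duplicate key error.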

Upvotes: 1
