Landon
Landon

Reputation: 4108

map_reduce() *and* find() in same query

I've found a bunch of map_reduce tutorials around, but none of them seem to have a "where" clause in them or any other way to exclude documents/records from what's being considered. I'm working on a seemingly easy query. I have a basic log file of events with timestamps, ip addresses, and campaign ids. I want to get a count of unique users, within a given timestamp range, for a given campaign. Sounds easy!

I built out a query object that is something like this:

{'ts': {'$gt': 1345840456, '$lt': 2345762454}, 'cid': '2636518'}

With that, I've tried two things, one using distinct, and the other with map_reduce:

Distinct

db.alpha2.find(query).distinct('ip').count()

In the mongo shell, you can put the query as a second parameter of the distinct function, and it works there, but I've read that you can't do that in pymongo.

Map_reduce

map = Code("function () {"
        "    emit(this.ip, 1);"
        "}")
reduce = Code("function (key, values) {"
    "  var total = 0;"
    "  for (var i = 0; i < values.length; i++) {"
    "    total += values[i];"
    "  }"
    "  return total;"
    "}")

totaluniqueimp = db.alpha2.map_reduce(map, reduce, "myresults").count();

(I realize the reduce function is doing stuff I don't need, I took it from the demo). This works fine, but makes no use of my "where" paramaters. I try this:

totaluniqueimp = db.alpha2.find(query).map_reduce(map, reduce, "myresults").count();`

And I get this error:

AttributeError: 'Cursor' object has no attribute 'map_reduce'

Conclusion

Basically, this is what I'm trying to do in mysql:

select count(*) from records where ts<1000 and ts>900 and campaignid=234 group by ipaddress

It seems so simple! How do you do this in mongo?

UPDATE: ANSWER

Based off of Dmitry's answer below, I was able to solve (and simplify) my solution to (is this as simple as I can make it?):

#query is an object that was built above this
map = Code("function () { emit(this.ip, 1);}")
reduce = Code("function (key, values) {return 1;}")
totaluniqueimp = collection.map_reduce(map, reduce, "myresults", query=query).count();

Thanks Dmitry!

Upvotes: 4

Views: 681

Answers (2)

arober11
arober11

Reputation: 2019

Not sure if this is possible via pymongo, the manual indicates it should be, but in the mongoDB shell you have a group() function, that will easily allow you to re-write the SQL in your question:

select count(*) 
  from records 
 where ts<1000 
   and ts>900 
   and campaignid=234
 group by ipaddress;

As:

db. alpha2.group(
   { cond: { 'ts': {'$gt': 900, '$lt': 1000}, 'campaignid': '234' }
   , key: {  "ipaddress" : 1 }
   , initial: {count : 0}
   , reduce: function(doc, out){ out.count++}
   }
);

Upvotes: 0

Dmitry
Dmitry

Reputation: 4442

You could try using this:

totaluniqueimp = db.alpha2.map_reduce(map, reduce, {
    out: "myresults",
    query: {'ts': {'$gt': 1345840456, '$lt': 2345762454}, 'cid': '2636518'}
}).count();

UPDATE: the statement above works in the mongo shell. In pymongo you should add the query as the fourth parameter:

totaluniqueimp = db.alpha2.map_reduce(map, reduce, "myresults", query={'ts': {'$gt': 1345840456, '$lt': 2345762454}, 'cid': '2636518'})

The detailed documentation can be found here.

Upvotes: 4

Related Questions