Reputation: 306
I have a table of traces with following structure
id | country_iso | object_id | created_at
Each object can have many entries for one or more country. I need to figure out how many objects are now in each country.
I can't just group it by country_iso, because in this case I'll get number of entries for each country, but not number of objects.
I can't group it by country_iso and object_id, because in this time I'll get number of entries for each object in each country.
So, seems I have to get last entry for each object and then group them by country and get quantity. Also I need to avoid cases when last entry has country_iso equal to null. So if last entry for object is null, we have to get entry before last and so one ( so, the latest with not null country_iso ).
Example:
1 | US | 1 | 25.02.02
2 | null | 1 | 26.02.02
3 | UK | 2 | 25.02.02
4 | UK | 3 | 25.02.02
5 | UK | 4 | 25.02.02
6 | US | 4 | 26.02.02 `
Result will be
US | 2
UK | 2
Thanks in advance for any ideas.
P.S.: Note that there's a lot of data( more than 100 000 of objects and more than 10 entries for each) and it's on remote server. So, I can't get data and somehow recalculate them using ruby on main server.
Upvotes: 1
Views: 438
Reputation: 9094
A less clever way is to abuse distinct
...
counts = MyModel.distinct(:country_iso).map do |country_iso|
count = MyModel.where(:country_iso => country_iso).distinct(:object_id).count
[country_iso, count]
end
...though that's 1 query per country. May not be horrid @ 1ms per query.
Upvotes: 0
Reputation: 9094
You'll want a map-reduce, or the shorter group version.
http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
http://api.mongodb.org/ruby/current/Mongo/Collection.html#group-instance_method
https://github.com/mongodb/mongo-ruby-driver/blob/master/lib/mongo/collection.rb
Here's a shot at the code:
MyModel.collection.group(
:key => :country_iso
:initial => { :object_id_set => {} },
:reduce => 'function (obj,prev) { prev.object_id_set[obj.object_id] = 1; }',
:finalize => 'function (final) { final.object_count = keys(final.object_id_set).length }'
)
Upvotes: 1