Reputation: 1882
For monitoring an application with CouchDB, I need to sum up a field of my data (for example, the time needed to execute a method that has been logged).
That's no problem for me with map-reduce, but I need to sum up only the data recorded in a specific time slice.
Example records:
{_id: 1, methodID:1, recorded: 100, timeneeded: 10},
{_id: 2, methodID:1, recorded: 200, timeneeded: 11},
{_id: 3, methodID:2, recorded: 200, timeneeded: 2},
{_id: 4, methodID:1, recorded: 300, timeneeded: 6},
{_id: 5, methodID:2, recorded: 310, timeneeded: 3},
{_id: 6, methodID:1, recorded: 400, timeneeded: 9}
Now I would like to get just the sum of timeneeded of all records that have been recorded in the range of 200 to 350, grouped by methodID. (That would be 17 for methodID:1 and 5 for methodID:2.)
How can I do that?
I now tried it with a list function that's using WickedGrey's idea. See my functions here:
map function:
function(doc) {
  emit([doc.recorded], {methodID: doc.methodID, timeneeded: doc.timeneeded});
}
list function:
function(head, req) {
  var combined_values = {};
  var row;
  // Each view row exposes its emitted data as row.value (not row.values).
  while (row = getRow()) {
    if (row.value.methodID in combined_values) {
      combined_values[row.value.methodID] += row.value.timeneeded;
    }
    else {
      combined_values[row.value.methodID] = row.value.timeneeded;
    }
  }
  // Send one JSON object per methodID with its summed time.
  for (var methodID in combined_values) {
    send(toJSON({method: methodID, timeneeded: combined_values[methodID]}));
  }
}
Now I have two problems:
1. I always get the results as a file, and Firefox asks me if I want to download it instead of showing it in the browser, as it does when I query a classic view.
2. As I understand it, the results are now calculated on the fly in the list function. I expect this to be rather slow with hundreds of millions of records. Any ideas how to make it faster?
Thank you for your help! andy
Upvotes: 0
Views: 741
Reputation: 101
function map(doc) {
  if (doc.methodID && doc.recorded && doc.timeneeded) {
    emit([doc.methodID, doc.recorded], doc.timeneeded);
  }
}
//reduce
_sum
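With a key of [methodID, recorded], a time slice has to be requested once per method, for example with startkey=[1,200] and endkey=[1,350], and the reduce then returns a single sum for that range. The following is only an in-memory sketch of that behaviour on the example records from the question, not CouchDB itself; sumForMethod is a hypothetical helper made up for illustration.

```javascript
// In-memory sketch, not CouchDB itself: mimics the view above on the example records.
var docs = [
  {_id: 1, methodID: 1, recorded: 100, timeneeded: 10},
  {_id: 2, methodID: 1, recorded: 200, timeneeded: 11},
  {_id: 3, methodID: 2, recorded: 200, timeneeded: 2},
  {_id: 4, methodID: 1, recorded: 300, timeneeded: 6},
  {_id: 5, methodID: 2, recorded: 310, timeneeded: 3},
  {_id: 6, methodID: 1, recorded: 400, timeneeded: 9}
];

// Same emit as the map function above: key [methodID, recorded], value timeneeded.
var rows = docs.map(function (doc) {
  return {key: [doc.methodID, doc.recorded], value: doc.timeneeded};
});

// Hypothetical helper: sums the rows of one method whose recorded time lies
// in [from, to], standing in for startkey=[methodID, from],
// endkey=[methodID, to] combined with the _sum reduce.
function sumForMethod(methodID, from, to) {
  return rows
    .filter(function (row) {
      return row.key[0] === methodID && row.key[1] >= from && row.key[1] <= to;
    })
    .reduce(function (total, row) { return total + row.value; }, 0);
}

console.log(sumForMethod(1, 200, 350)); // 17
console.log(sumForMethod(2, 200, 350)); // 5
```

The trade-off is one request per methodID, but each request is answered from the precomputed reduce values rather than by walking every row.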
Upvotes: 1
Reputation: 1447
You can't use a map key to filter by one set of criteria, but group by another in CouchDB. However, you can filter the keys by time range, and group with a reduce function. Try something like this:
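function map(doc) {
  // An object literal cannot use doc.methodID as a key directly, so build it step by step.
  var totals = {};
  totals[doc.methodID] = doc.timeneeded;
  emit(doc.recorded, totals);
}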
function reduce(key, values, rereduce) {
  var combined_values = {};
  for (var i in values) {
    var totals = values[i];
    for (var methodID in totals) {
      if (methodID in combined_values) {
        combined_values[methodID] += totals[methodID];
      }
      else {
        combined_values[methodID] = totals[methodID];
      }
    }
  }
  return combined_values;
}
That should allow you to specify a start/end key, and with group_level=0 should get you a value containing the dictionary that you're looking for.
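To check the idea against the example records from the question, here is a rough in-memory sketch in plain JavaScript. It is not CouchDB itself: mapDoc and queryRange are hypothetical stand-ins for the view's map step and for a request with startkey=200 and endkey=350, and the reduce uses the same merging logic in a slightly shorter form.

```javascript
// In-memory sketch, not CouchDB itself: runs the map and reduce logic on the example records.
var docs = [
  {_id: 1, methodID: 1, recorded: 100, timeneeded: 10},
  {_id: 2, methodID: 1, recorded: 200, timeneeded: 11},
  {_id: 3, methodID: 2, recorded: 200, timeneeded: 2},
  {_id: 4, methodID: 1, recorded: 300, timeneeded: 6},
  {_id: 5, methodID: 2, recorded: 310, timeneeded: 3},
  {_id: 6, methodID: 1, recorded: 400, timeneeded: 9}
];

// Map step: key is the recorded time, value is a one-entry object {methodID: timeneeded}.
function mapDoc(doc) {
  var totals = {};
  totals[doc.methodID] = doc.timeneeded;
  return {key: doc.recorded, value: totals};
}

// Merges per-method totals; inputs and output have the same shape,
// so the same code also works when it is fed its own results (rereduce).
function reduce(keys, values, rereduce) {
  var combined = {};
  values.forEach(function (totals) {
    for (var methodID in totals) {
      combined[methodID] = (combined[methodID] || 0) + totals[methodID];
    }
  });
  return combined;
}

// Hypothetical stand-in for a view request with startkey/endkey (both ends inclusive).
function queryRange(startkey, endkey) {
  var values = docs.map(mapDoc)
    .filter(function (row) { return row.key >= startkey && row.key <= endkey; })
    .map(function (row) { return row.value; });
  return reduce(null, values, false);
}

var result = queryRange(200, 350);
console.log(JSON.stringify(result)); // {"1":17,"2":5}
```

Note that the reduced value grows with the number of distinct methodIDs, which is what the reduce limit check complains about when there are many of them.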
Edit: Also, this thread might be of interest:
http://couchdb-development.1959287.n2.nabble.com/reduce-limit-error-td2789734.html
It discusses an option to turn off the "reduce must shrink" message, and further down the thread offers another way of achieving the same goal: using a list function. That might be a better approach than what I've outlined here. :(
Upvotes: 1