Reputation: 10498
I have been using CouchDB for quite some time without any issues. That is, up until now. I recently saw something in my map/reduce results which I had overlooked!
This is before performing a sum on the "avgs" variable. I'm basically trying to find the average of all values pertaining to a particular key. Nothing fancy. The result is as expected.
Note the result for timestamp 1308474660000 (4th row in the table):
Now I sum the "avgs" array. Here is something peculiar about the result: the sum for the key with timestamp 1308474660000 is null! Why is CouchDB spitting out nulls for a simple sum? I tried a custom addition function and it's the same problem.
Can someone explain why there is this issue with my map/reduce result?
CouchDB version: 1.0.1
UPDATE:
After doing a rereduce I get a reduce overflow error!
Error: reduce_overflow_error
Reduce output must shrink more rapidly: Current output: '["001,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,101,1,1,1,1,1,1,11,1,1,1,1'... (first 100 of 396 bytes)
This is my modified reduce function:
function (key, values, rereduce) {
    if (!rereduce) {
        var avgs = [];
        for (var i = values.length - 1; i >= 0; i--) {
            avgs.push(Number(values[i][0]) / Number(values[i][1]));
        }
        return avgs;
    } else {
        return sum(values);
    }
}
UPDATE 2:
Well, now it has gotten worse. It's selectively rereducing. Also, the ones it has rereduced show wrong results. The length of the value in the 4th row, for timestamp 1308474660000, should be 2 and not 3.
UPDATE 3:
I finally got it to work. I hadn't understood the specifics of rereduce properly. AFAIK, CouchDB itself decides how and when to rereduce. In this example, whenever the array was long enough to process, CouchDB would send it to rereduce. So I basically had to sum twice: once in reduce, and again in rereduce.
function (key, values, rereduce) {
    if (!rereduce) {
        var avgs = [];
        for (var i = values.length - 1; i >= 0; i--) {
            avgs.push(Number(values[i][0]) / Number(values[i][1]));
        }
        return sum(avgs);
    } else {
        // If my understanding of rereduce is correct, it only receives the avgs
        // that are large enough to not be processed by reduce.
        return sum(values);
    }
}
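The "sum twice" idea can be checked outside CouchDB with a small sketch, for example in Node.js. The sum helper below is only a stand-in for CouchDB's built-in sum, and the sample rows and the split into two batches are made up for illustration; CouchDB chooses its own batching.

```javascript
// Stand-in for CouchDB's built-in sum(); only needed outside CouchDB.
function sum(numbers) {
    var total = 0;
    for (var i = 0; i < numbers.length; i++) {
        total += numbers[i];
    }
    return total;
}

// The reduce function from UPDATE 3.
function reduceFn(key, values, rereduce) {
    if (!rereduce) {
        var avgs = [];
        for (var i = values.length - 1; i >= 0; i--) {
            avgs.push(Number(values[i][0]) / Number(values[i][1]));
        }
        return sum(avgs);
    } else {
        return sum(values);
    }
}

// Hypothetical map output for one key: [value, total] pairs.
var rows = [[10, 2], [9, 3], [8, 4]];

// All rows reduced in a single call.
var singlePass = reduceFn(null, rows, false);

// The same rows reduced in two batches, then combined by a rereduce call.
var partialA = reduceFn(null, rows.slice(0, 2), false);
var partialB = reduceFn(null, rows.slice(2), false);
var combined = reduceFn(null, [partialA, partialB], true);
```

With these sample rows, singlePass and combined hold the same number, because every call returns a plain number that a later rereduce can add up again.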
Upvotes: 1
Views: 1029
Reputation: 239
I use the following code to compute the average. Hope it helps.
function (key, values) {
    return sum(values) / values.length;
}
Upvotes: -1
Reputation: 73752
I will elaborate on my count/sum comment, just in case you are curious.
This code is not tested, but hopefully you will get the idea. The end result is always a simple object {"count":C, "sum":S} and you know the average by computing S / C.
function (key, values, rereduce) {
    // Reduce function
    var count = 0;
    var sum = 0;
    var i;
    if (!rereduce) {
        // `values` stores actual map output
        for (i = 0; i < values.length; i++) {
            count += Number(values[i][1]);
            sum += Number(values[i][0]);
        }
        return {"count": count, "sum": sum};
    }
    else {
        // `values` stores count/sum objects returned previously.
        for (i = 0; i < values.length; i++) {
            count += values[i].count;
            sum += values[i].sum;
        }
        return {"count": count, "sum": sum};
    }
}
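To see how the count/sum approach behaves across a re-reduce, here is a small sketch that can be run outside CouchDB, for example in Node.js. The reduceFn below is a condensed copy of the function above; the sample rows and the way they are split into two batches are assumptions made for illustration only.

```javascript
// Condensed copy of the count/sum reduce function above.
function reduceFn(key, values, rereduce) {
    var count = 0;
    var sum = 0;
    var i;
    if (!rereduce) {
        // `values` holds map output: [sum, count] pairs (assumed shape).
        for (i = 0; i < values.length; i++) {
            count += Number(values[i][1]);
            sum += Number(values[i][0]);
        }
    } else {
        // `values` holds count/sum objects returned by earlier calls.
        for (i = 0; i < values.length; i++) {
            count += values[i].count;
            sum += values[i].sum;
        }
    }
    return {"count": count, "sum": sum};
}

// Hypothetical map output, split into two batches.
var batchA = [[10, 2], [20, 3]];
var batchB = [[30, 5]];

var partialA = reduceFn(null, batchA, false);
var partialB = reduceFn(null, batchB, false);

// Re-reduce: combine the partial results into one object.
var combined = reduceFn(null, [partialA, partialB], true);

// The client computes the average from the final object.
var average = combined.sum / combined.count;
```

The division happens only once, on the final object, so the result does not depend on how the rows were batched.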
Upvotes: 1
Reputation: 73752
Your for loop in the reduce function is probably not doing what you think it is. For example, it might be throwing an exception that you did not expect.
You are expecting an array of 2-tuples:
// Expectation
values = [ [value1, total1]
, [value2, total2]
, [value3, total3]
];
During a re-reduce, the function instead receives the results it returned from earlier calls.
// Re-reduce values
values = [ avg1
, avg2
, avg3
]
Therefore I would begin by examining how your code works if and when rereduce is true. Perhaps something simple will fix it (although often I have to log() things until I find the problem).
function(keys, values, rereduce) {
    if (rereduce)
        return sum(values);
    // ... then the same code as before.
}
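One possible explanation for the null values, assuming the original reduce function indexed every element as values[i][0] and values[i][1] even during a re-reduce: indexing a plain number yields undefined, the division then yields NaN, and NaN is written to JSON as null. The function name naiveAverages and the sample values below are hypothetical.

```javascript
// Hypothetical reduce body that ignores the rereduce flag.
function naiveAverages(values) {
    var avgs = [];
    for (var i = 0; i < values.length; i++) {
        avgs.push(Number(values[i][0]) / Number(values[i][1]));
    }
    return avgs;
}

// First pass: `values` really are 2-tuples, so the averages are numbers.
var firstPass = naiveAverages([[10, 2], [9, 3]]);

// Second pass: `values` are now plain numbers, so values[i][0] is undefined
// and every division produces NaN.
var secondPass = naiveAverages(firstPass);

// NaN has no JSON representation and is serialized as null.
var serialized = JSON.stringify(secondPass);
```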
Upvotes: 2