Shripad Krishna

Reputation: 10498

Peculiar Map/Reduce result from CouchDB

I have been using CouchDB for quite some time without any issues, up until now. I recently noticed something in my map/reduce results that I had overlooked!

This is before performing a sum on the "avgs" variable. I'm basically trying to find the average of all values pertaining to a particular key. Nothing fancy. The result is as expected.

Note the result for timestamp 1308474660000 (4th row in the table):

[Image: results before summing avgs]


Now I sum the "avgs" array, and the result is peculiar: the sum for the key with timestamp 1308474660000 is null! Why is CouchDB returning null for a simple sum? I tried a custom addition function and got the same problem.

Can someone explain why my map/reduce result has this issue?

CouchDB version: 1.0.1


UPDATE:

After doing a rereduce I get a reduce overflow error!

Error: reduce_overflow_error

Reduce output must shrink more rapidly: Current output: '["001,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,1,11,1,1,1,1,1,101,1,1,1,1,1,1,11,1,1,1,1'... (first 100 of 396 bytes)

This is my modified reduce function:

function (key, values, rereduce) {
  if(!rereduce) {
    var avgs = [];
    for(var i=values.length-1; i>=0 ; i--) {
      avgs.push(Number(values[i][0])/Number(values[i][1]));
    }
    return avgs;
  } else {
    return sum(values);
  };
}
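The string in the error output above looks like numbers and arrays being joined with `+`. Here is a minimal sketch of how that can happen, assuming CouchDB's built-in sum() simply accumulates with `+=` (the `sum` below is a local stand-in, not the real helper, and the `partials` data is made up):

```javascript
// Stand-in for CouchDB's built-in sum(); assumed here to accumulate with `+=`.
function sum(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Two made-up partial results, shaped like the arrays the !rereduce branch returns.
var partials = [[1, 1, 1], [1, 1]];

// Adding an array to a number coerces both to strings, so the "sum" is a
// growing string instead of a number.
var rereduced = sum(partials);

console.log(typeof rereduced); // "string"
console.log(rereduced);        // "01,1,11,1"
```

If that assumption holds, the rereduce output grows with the input instead of shrinking, which would fit the "must shrink more rapidly" message.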

UPDATE 2:

Well, now it has gotten worse. It's selectively rereducing. Also, the rows it has rereduced show wrong results. The length of the value in the 4th row, for timestamp 1308474660000, should be 2 and not 3.


UPDATE 3:

I finally got it to work. I hadn't understood the specifics of rereduce properly. AFAIK, CouchDB itself decides how and when to rereduce. In this example, whenever the array was long enough to process, CouchDB would send it to rereduce. So I basically had to sum twice: once in reduce, and again in rereduce.

function (key, values, rereduce) {
  if(!rereduce) {
    var avgs = [];
    for(var i=values.length-1; i>=0 ; i--) {
      avgs.push(Number(values[i][0])/Number(values[i][1]));
    }
    return sum(avgs);
  } else {
    return sum(values); // If my understanding of rereduce is correct, it receives only the avgs that are large enough to not be processed by reduce.
  }
}
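To check that the two branches agree, the function can be run locally. This is only a sketch: `sum` is a stand-in for CouchDB's helper, `simulateView` and its fixed chunk size are hypothetical (CouchDB picks its own grouping), and the rows are made-up [value, total] pairs.

```javascript
// Stand-in for CouchDB's built-in sum() helper (assumption: plain numeric add).
function sum(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// The reduce function from UPDATE 3, with the same behavior.
function reduceFn(key, values, rereduce) {
  if (!rereduce) {
    var avgs = [];
    for (var i = values.length - 1; i >= 0; i--) {
      avgs.push(Number(values[i][0]) / Number(values[i][1]));
    }
    return sum(avgs);
  } else {
    return sum(values);
  }
}

// Hypothetical driver: reduce each chunk of map rows, then rereduce the partials.
function simulateView(rows, chunkSize) {
  var partials = [];
  for (var i = 0; i < rows.length; i += chunkSize) {
    partials.push(reduceFn(null, rows.slice(i, i + chunkSize), false));
  }
  return reduceFn(null, partials, true);
}

var rows = [[10, 2], [9, 3], [8, 4], [20, 5]]; // made-up map output
var oneChunk = simulateView(rows, 4);  // all rows reduced together
var twoChunks = simulateView(rows, 2); // two partial results, then rereduce

console.log(oneChunk, twoChunks); // 14 14
```

Both paths give the same number regardless of chunking, which is the property a reduce function needs. Note that the value is the sum of the per-row ratios, so it still has to be divided by a row count to become an average.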

Upvotes: 1

Views: 1029

Answers (3)

Jackie Lee

Reputation: 239

I use the following code to compute an average. Hope it helps.

function (key, values) {
    return sum(values)/values.length;    
}

Upvotes: -1

JasonSmith

Reputation: 73752

I will elaborate on my count/sum comment, just in case you are curious.

This code is not tested, but hopefully you will get the idea. The end result is always a simple object {"count":C, "sum":S} and you know the average by computing S / C.

function (key, values, rereduce) {
  // Reduce function
  var count = 0;
  var sum = 0;
  var i;

  if(!rereduce) {
    // `values` stores actual map output
    for(i = 0; i < values.length; i++) {
      count += Number(values[i][1]);
      sum += Number(values[i][0]);
    }

    return {"count":count, "sum":sum};
  }

  else {
    // `values` stores count/sum objects returned previously.
    for(i = 0; i < values.length; i++) {
      count += values[i].count;
      sum   += values[i].sum;
    }

    return {"count":count, "sum":sum};
  }
}
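As a rough usage sketch (also untested against a live CouchDB), this is how the count/sum result could be merged and turned into an average on the client. The reduce is restated compactly as `countSumReduce` so the snippet runs on its own; that name, the `averageOf` helper, and the sample rows are made up for illustration.

```javascript
// Compact restatement of the count/sum reduce, so this sketch is standalone.
function countSumReduce(key, values, rereduce) {
  var count = 0;
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // `values` holds {"count", "sum"} objects from earlier calls.
      count += values[i].count;
      total += values[i].sum;
    } else {
      // `values` holds [value, total] pairs from the map step.
      count += Number(values[i][1]);
      total += Number(values[i][0]);
    }
  }
  return {"count": count, "sum": total};
}

// Hypothetical client-side helper: average = S / C, guarding against C == 0.
function averageOf(reduced) {
  return reduced.count === 0 ? null : reduced.sum / reduced.count;
}

var rows = [[10, 2], [9, 3], [8, 4], [20, 5]]; // made-up map output
var a = countSumReduce(null, rows.slice(0, 2), false);
var b = countSumReduce(null, rows.slice(2), false);
var merged = countSumReduce(null, [a, b], true);
var direct = countSumReduce(null, rows, false);

console.log(merged, averageOf(merged));
```

Because counts and sums both add up cleanly, `merged` and `direct` are identical no matter how the rows were split, and the division happens only once at the end.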

Upvotes: 1

JasonSmith

Reputation: 73752

Your for loop in the reduce function is probably not doing what you think it is. For example, it might be throwing an exception that you did not expect.

You are expecting an array of 2-tuples:

// Expectation
values = [ [value1, total1]
         , [value2, total2]
         , [value3, total3]
         ];

During a re-reduce, the function receives its own earlier results as input.

// Re-reduce values
values = [ avg1
         , avg2
         , avg3
         ]

Therefore I would begin by examining how your code works if and when rereduce is true. Perhaps something simple like the following will fix it (although I often have to log() things until I find the problem).

function(keys, values, rereduce) {
  if(rereduce)
     return sum(values);

  // ... then the same code as before.
}
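Here is one plausible way a null could appear, sketched under assumptions: `naiveReduce` is a hypothetical reduce that runs the 2-tuple loop even on re-reduce input, `sum` is a local stand-in for CouchDB's helper, and the result is serialized with standard JSON.stringify (whether CouchDB 1.0.1 serializes the same way is an assumption).

```javascript
// Stand-in for CouchDB's built-in sum() helper (assumption: plain numeric add).
function sum(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical reduce that ignores the rereduce flag entirely.
function naiveReduce(key, values, rereduce) {
  var avgs = [];
  for (var i = 0; i < values.length; i++) {
    avgs.push(Number(values[i][0]) / Number(values[i][1]));
  }
  return sum(avgs);
}

// First pass: real 2-tuples, so the result is an ordinary number.
var first = naiveReduce(null, [[10, 2], [9, 3]], false);

// Re-reduce pass: values are plain numbers, so values[i][0] is undefined
// and Number(undefined) is NaN.
var second = naiveReduce(null, [first, 6], true);

// NaN has no JSON representation, so it is written out as null.
var asJson = JSON.stringify({ value: second });

console.log(first, second, asJson);
```

The first pass looks fine, which would explain why only some rows (the ones that happened to be re-reduced) show null.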

Upvotes: 2
