user791770

Reputation: 249

In CouchDB, how do I find the most frequently occurring value?

I am trying to classify levels of aggregation by finding the most frequently occurring value of a particular field in the documents that are reduced to a given level.

I have documents like this:

{ "year": 2012,
  "month": 1,
  "category": "blue"
},

{ "year": 2012,
  "month": 1,
  "category": "blue"
},

{ "year": 2012,
  "month": 1,
  "category": "blue"
},

{ "year": 2012,
  "month": 1,
  "category": "green"
}

The map function basically emits these documents back out with [year, month] as the key (though I could include the category if needed). I want the reduce to then reduce down to the most frequently occurring category.

In the case of my examples above, group=false, group_level=1, and group_level=2 should all reduce to "blue".

I thought of changing the key to [year, month, category] in the hope that I could count the category values as I moved up the aggregation levels. But that doesn't seem to work.

How would I find the most frequently occurring value for category? I feel like the answer is simple, but I'm just not connecting the dots.

Thanks.

Upvotes: 3

Views: 118

Answers (1)

sinm

Reputation: 1377

It's simple, but the solution I worked out is not concise.

{
   "views": {
       "most_category": {
           "map": "function(doc){
             if (doc.category && doc.year && doc.month) {
                /* Emit a one-entry tally so the reduce can merge tallies. */
                var hash = {};
                hash[doc.category] = 1;
                emit([doc.year, doc.month], hash);
             }
           }",
           "reduce": "function(keys, values, rereduce) {
              /* Merge the tallies into a fresh object. On rereduce each
                 value is a previous result, so read its counts field. */
              var agg = {};
              for (var i = 0; i < values.length; ++i) {
                var tally = rereduce ? values[i].counts : values[i];
                for (var category in tally) {
                  agg[category] = (agg[category] || 0) + tally[category];
                }
              }
              /* Pick the category with the highest count. */
              var most_category = null;
              var most_count = 0;
              for (var category in agg) {
                if (most_count < agg[category]) {
                  most_category = category;
                  most_count = agg[category];
                }
              }
              /* Keep the full tally in the result: a later rereduce pass
                 needs every count, not only the winner's. */
              return { most: most_category, count: most_count, counts: agg };
           }"
       }
   }
}
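
One thing to watch for is rereduce: CouchDB may reduce subsets of the rows first and then combine those partial results, so a partial result that carries only its own winner cannot be merged correctly. A rough way to check this outside CouchDB is the plain JavaScript sketch below. The helper names (mapDoc, mergeTallies, mostFrequent) are made up for illustration and are not CouchDB APIs, and the two-chunk split only imitates how the rows might be partitioned.

```javascript
// Hypothetical local harness: none of these helpers are CouchDB APIs.
const docs = [
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "green" },
];

// Map step: one [key, value] row per document; the value is a one-entry tally.
function mapDoc(doc) {
  return [[doc.year, doc.month], { [doc.category]: 1 }];
}

// Reduce step: merge tallies. Input and output have the same shape,
// so the same function can also play the role of the rereduce step.
function mergeTallies(tallies) {
  const agg = {};
  for (const tally of tallies) {
    for (const [category, count] of Object.entries(tally)) {
      agg[category] = (agg[category] || 0) + count;
    }
  }
  return agg;
}

// Pick the category with the highest count (the first one seen wins a tie).
function mostFrequent(tally) {
  let best = null;
  let bestCount = 0;
  for (const [category, count] of Object.entries(tally)) {
    if (count > bestCount) {
      best = category;
      bestCount = count;
    }
  }
  return best;
}

// Imitate CouchDB reducing two chunks separately, then rereducing them.
const rows = docs.map(mapDoc);
const chunkA = mergeTallies(rows.slice(0, 2).map(([, value]) => value));
const chunkB = mergeTallies(rows.slice(2).map(([, value]) => value));
const total = mergeTallies([chunkA, chunkB]);

// For the sample documents: blue is counted 3 times, green once.
console.log(total, mostFrequent(total));
```

With the four sample documents the merged tally has blue at 3 and green at 1, so the most frequent category comes out as "blue" regardless of how the rows are split. If there are many distinct categories the tally grows with them, and depending on the reduce_limit setting CouchDB may reject a reduce whose output does not shrink, so this is only a reasonable fit for a small set of categories.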

Upvotes: 1
