wayoutmind
wayoutmind

Reputation: 416

Return unique values by key in CouchDB

Is there a way to do the following in CouchDB? A way to return unique, distinct values by a given key?

SELECT DISTINCT field FROM table WHERE key="key1"

'key1' => 'somevalue'
'key1' => 'somevalue'
'key2' => 'anotherval'
'key2' => 'andanother'
'key2' => 'andanother'

For example:

http://localhost:5984/database/_design/designdoc/_view/distinctview?key="key1" would return ['somevalue']

http://localhost:5984/database/_design/designdoc/_view/distinctview?key="key2" would return ['anotherval', 'andanother']

Upvotes: 10

Views: 11733

Answers (3)

J. Reynolds
J. Reynolds

Reputation: 140

    map: (doc, emit) => {
         emit(doc.tag, doc.value);
    },
    reduce: (keys, values, rereduce) => {
    
         if (rereduce) {
              let res = {};
              values.forEach(v => {
                 res = {
                       ...res,
                       ...v
                     }
              })
          }
          else {
               const res = {};
               keys.forEach((key, idx) => res[key[0] + ':' + values[idx]] = true);
               return res;
          }
    
    }

yields the following:

key1:somevalue: true
key2:andanother: true
key2:anotherval: true

which one can just extract.

Upvotes: 0

Marcello Nuccio
Marcello Nuccio

Reputation: 3901

As suggested in the CouchDB definitive guide, you should put the values you want to be unique in the key, then query the reduce function with group=true.

For example, given that keyfield is the field with "key1" and "key2" and valuefield is the field with the values, your map function could be:

function(doc) {
  // filter to get only the interesting documents: change as needed
  if (doc.keyfield && doc.valuefield) {
    /*
     * This is the important stuff:
     * 
     *  - by putting both, the key and the value, in the emitted key,
     *    you can filter out duplicates
     *    (simply group the results on the full key);
     * 
     *  - as a bonus, by emitting 1 as the value, you get the number
     *    of duplicates by using the `_sum` reduce function.
     */
    emit([doc.keyfield, doc.valuefield], 1);
  }
}

and your reduce function could be:

_sum

Then querying with group=true&startkey=["key2"]&endkey=["key2",{}] gives:

{"rows":[
{"key":["key2","anotherval"],"value":1},
{"key":["key2","andanother"],"value":2}
]}

Upvotes: 11

Dominic Barnes
Dominic Barnes

Reputation: 28429

Based on what I see here, (I'll change my answer if needed) key1 and key2 look like independent fields, so you'll need 2 separate views.

I created 5 simple documents in my test database:

// I've left out fields like _id and _rev for the sake of simplicity
{ "key1": "somevalue" }
{ "key1": "somevalue" }
{ "key2": "anotherval" }
{ "key2": "andanother" }
{ "key2": "andanother" }

Here are the 2 view queries you'll need:

// view for key1
function(doc) {
  if (doc.key1) {
    emit("key1", doc.key1);
  }
}

// view for key2
function(doc) {
  if (doc.key2) {
    emit("key2", doc.key2);
  }
}

From there, your reduce function can return all the values in an array by just doing this:

function (key, values) {
  return values;
}

However, you specifically mentioned distinct values. Since JavaScript doesn't have a native unique() method for arrays, and we can't use CommonJS modules in view functions, we'll have to add our own logic for that. I just copy-pasted the first array.unique() function I found on Google, you can write your own that is better optimized for sure.

function (key, values, rereduce) {
  var o = {}, i, l = values.length, r = [];

  for (i = 0; i < l; i += 1) {
    o[values[i]] = values[i];
  }

  for (i in o) {
    r.push(o[i]);
  }

  return r;
}

You'll use this same reduce function in both views. When you query any of those views, by default it will also perform the reduce. (You'll need to explicitly pass reduce=false to get just the results of your map function.

Here are the result-sets you'd retrieve using the above map/reduce queries: (remember they are 2 separate queries)

{"rows":[
  {"key":"key1", "value": ["somevalue"]}
]}

{"rows":[
  {"key": "key2", "value": ["anotherval", "andanother"]}
]}

Upvotes: 3

Related Questions