Patrick

Reputation: 8063

CouchDB / Couchbase view ordered by number of keys

I'm trying to write a view that shows me the top 10 tags used in my system. It's fairly easy to get the counts by using _count as the reduce function, but that does not order the list by count. Is there any way to do this?

function(doc, meta) {
  if(doc.type === 'log') {
    emit(doc.tag, 1);
  }
}
_count

As a result I'd like to have:

Instead of

Most importantly, I do not want to transfer the full set to my application server and handle it there.

Upvotes: 1

Views: 642

Answers (2)

m03geek

Reputation: 2538

In Couchbase you can't sort the result in or after the reduce step, so you can't directly get a "Top 10" of anything. View results are always sorted by key. The best approach is:

  1. Query your view, which returns key-value pairs (tag_name, count_value) ordered by tag_name.
  2. Create a job that runs every N minutes, takes the results from step 1, sorts them by count, and writes the sorted results to a separate key (e.g. "Top10Tags").
  3. In your app, query the key Top10Tags.

This reduces traffic, but the results can be outdated. You can also run that job on the same server Couchbase runs on (e.g. as a small node.js app or something similar), so that it consumes only loopback traffic and a small amount of CPU for sorting every N minutes.
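
The sorting step of that job could look roughly like this. It is only a minimal sketch, assuming the view was queried with group=true so that each row has the shape { key: tagName, value: count }. The helper name topTags and the sample rows are made up for illustration; fetching the rows and writing the result to a key such as Top10Tags is left to whichever SDK you use.

```javascript
// Hypothetical helper: pick the n most-used tags from grouped view rows.
// Assumes rows look like [{ key: "tagName", value: count }, ...].
function topTags(rows, n) {
  return rows
    .slice() // copy so the caller's array is not reordered
    .sort(function (a, b) {
      if (b.value !== a.value) return b.value - a.value; // highest count first
      return a.key < b.key ? -1 : a.key > b.key ? 1 : 0; // ties: by tag name
    })
    .slice(0, n);
}

// Made-up sample rows, in key order as a view would return them.
var rows = [
  { key: "db", value: 7 },
  { key: "error", value: 42 },
  { key: "login", value: 7 },
  { key: "ui", value: 15 }
];

var top2 = topTags(rows, 2);
```

Since only one row per distinct tag reaches the job, the sort itself is cheap unless the number of distinct tags is very large.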

Also, if you're using the _count reduce function, you don't need to emit a number; just emit null:

function(doc, meta) {
  if(meta.type === "json" && doc.type === 'log') {
    emit(doc.tag, null);
  }
}

And if you want documents that carry multiple tags, like

{
  "type": "log",
  "tags": ["tag1","tag2","tag3"]
}

Your map function should be:

function(doc, meta) {
  if(meta.type === "json" && doc.type === 'log') {
    for(var i = 0; i < doc.tags.length; i++){
      emit(doc.tags[i], null);
    }
  }
}
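
To see what this map function combined with _count would produce, it can be simulated outside the server. The sketch below is not Couchbase code: emit is passed in as a parameter (inside a real view it is a global), countByKey is a hypothetical stand-in for _count with group=true, and the sample documents are invented.

```javascript
// The multi-tag map function, with emit passed in so it can run locally.
// (Inside a real view, emit is a global provided by the server.)
function mapTags(doc, meta, emit) {
  if (meta.type === "json" && doc.type === "log") {
    for (var i = 0; i < doc.tags.length; i++) {
      emit(doc.tags[i], null);
    }
  }
}

// Hypothetical stand-in for _count with group=true: count rows per key.
function countByKey(docs) {
  var counts = Object.create(null);
  docs.forEach(function (doc) {
    mapTags(doc, { type: "json" }, function (key, value) {
      counts[key] = (counts[key] || 0) + 1; // value is ignored, as with _count
    });
  });
  return counts;
}

// Invented sample documents.
var sampleDocs = [
  { type: "log", tags: ["tag1", "tag2"] },
  { type: "log", tags: ["tag2", "tag3"] },
  { type: "user", tags: ["tag1"] } // skipped: not a log document
];

var counts = countByKey(sampleDocs);
// counts is { tag1: 1, tag2: 2, tag3: 1 }
```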

One more thing about that top 10 list: you can store it in a memcached bucket if you don't want to store it on disk.

Upvotes: 2

ddouglascarr

Reputation: 1442

This is something you'd think would be easy, but it isn't really.

In CouchDB, I'd use a list function and order the results with JavaScript's sort(). That way everything is sorted on the server side, and the list can return only the top 10.

Bear in mind that with large data sets this will be slow, because the list function has to read every row before it can sort them.
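
A list function along those lines might look like the sketch below. It assumes the view uses _count as its reduce and is queried with group=true, so each row carries a tag as key and its count as value. The variable name topTagsList is only there for illustration; in the design document the function expression itself is stored as a string under "lists".

```javascript
// Sketch of a CouchDB list function.
// getRow() and send() are globals provided by CouchDB's query server.
var topTagsList = function (head, req) {
  var rows = [];
  var row;
  while ((row = getRow())) { // buffers every grouped row in memory
    rows.push({ tag: row.key, count: row.value });
  }
  rows.sort(function (a, b) { return b.count - a.count; }); // highest first
  send(JSON.stringify(rows.slice(0, 10)));                  // top 10 only
};
```

It would be requested through the list endpoint with group=true, along the lines of /db/_design/<ddoc>/_list/<list>/<view>?group=true, where the names in angle brackets are placeholders for your own design document, list, and view.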

Upvotes: 0
