Reputation: 533
In the View Collation chapter of the official CouchDB documentation, under the first example (http://docs.couchdb.org/en/1.6.1/couchapp/views/collation.html#views-collation), it is recommended not to emit the document itself in the view. Instead, it suggests including the document bodies when requesting the view, by querying it with ?include_docs=true.
If I understood it correctly, instead of:
emit(doc._id, doc);
and getting results in the following format:
{"id":"1","key":"1","value":{"_id": "1", "someProp": "someVal"}},
it is suggested to send emits with null values:
emit(doc._id, null)
and then when querying my view with the include_docs parameter get results in the following format:
{
"id": "1",
"key": "1",
"value": null,
"doc": {
"_id": "1",
"_rev": "1-0eee81fecb5aa4f51e285c621271ff02",
"someProp": "someVal"
}
}
If this is the recommendation, then I would presume it performs better, but unfortunately the documentation doesn't elaborate on why, and other examples emit the document as the value as usual. Could anyone shed more light on this?
Upvotes: 3
Views: 541
Reputation: 28439
When you emit the entire document in a view, you are effectively duplicating the document on disk. This is because each view has its own file that stores the results of running the view on the database. Thus, if you have 3 views that each output your document, you have 4 copies floating around: the original plus one per view. (That is not counting multiple revisions of documents, which of course adds more duplicates.)
CouchDB uses disk space very liberally in order to make writes faster, largely because it uses an append-only storage structure. As a result, using views to output the same document repeatedly can cause your disk usage to grow very quickly. (Compacting your database and views generally helps, but it should not be something you force yourself into doing constantly.)
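To make the duplication concrete, here is a rough sketch that runs both map styles over two sample documents and compares how much data each view would have to store. It is not CouchDB code: `runView` is a hypothetical stand-in for the view server, `emit` is passed in as an argument instead of being a global, and the length of the JSON text is only a crude proxy for the real on-disk format.

```javascript
// Hypothetical stand-in for the view server: runs a map function over every
// document and collects the rows it emits. Not a CouchDB API.
function runView(map, docs) {
  const rows = [];
  for (const doc of docs) {
    // In CouchDB, emit() is a global inside the map function; here it is passed in.
    map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows;
}

// Sample documents (made up for illustration).
const docs = [
  { _id: "1", someProp: "someVal" },
  { _id: "2", someProp: "anotherVal" },
];

// The two styles from the question.
const emitWholeDoc = (doc, emit) => emit(doc._id, doc);
const emitNull = (doc, emit) => emit(doc._id, null);

const fullRows = runView(emitWholeDoc, docs);
const nullRows = runView(emitNull, docs);

// Length of the serialized rows, as a rough proxy for stored size.
const fullSize = JSON.stringify(fullRows).length;
const nullSize = JSON.stringify(nullRows).length;

console.log({ fullSize, nullSize });
```

With the first style, every row carries a full copy of the document, so the gap between the two sizes grows with the size of your documents and with the number of views that emit them.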
The trade-off to leaving the documents out is that when you are reading from the view, CouchDB will need to internally find each document and include it in the view's output. Since it is looking things up based on the id, it's a very fast operation, but it still incurs overhead. Thus, while this pattern is generally best practice, you should be open to examining the trade-off in the context of your application.
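The read-time cost can be sketched the same way. The snippet below mimics what `include_docs=true` adds to a request such as `GET /{db}/_design/{ddoc}/_view/{view}?include_docs=true`: one lookup by id per returned row. Again, this is only an illustration under assumptions: `withDocs` is a hypothetical helper and the `Map` stands in for the main database file.

```javascript
// Stand-in for the main database file: documents indexed by _id.
const docsById = new Map([
  ["1", { _id: "1", _rev: "1-0eee81fecb5aa4f51e285c621271ff02", someProp: "someVal" }],
]);

// Rows as the view would hold them after emit(doc._id, null).
const viewRows = [{ id: "1", key: "1", value: null }];

// Hypothetical helper: attaches the document to each row with one lookup
// by id per row, which is the extra work done at read time.
function withDocs(rows, store) {
  return rows.map((row) => ({
    ...row,
    doc: store.has(row.id) ? store.get(row.id) : null,
  }));
}

const result = withDocs(viewRows, docsById);
console.log(JSON.stringify(result[0], null, 2));
```

The view rows themselves stay small, and the document body is only pulled in when a query asks for it, so the lookup cost scales with the number of rows you actually read.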
Upvotes: 6