mpmp

Reputation: 2459

Couchbase: What benefits do I get from using the document ID?

I'm new to the NoSQL world; I've been programming against RDBMSs for a while now. In an RDBMS, you have the notion of a PRIMARY KEY per table. You reference other tables using FOREIGN KEYs, and usually, if the schema is well normalized, you have another table that basically just contains the mapping between TABLE A and TABLE B so you can join them.

In Couchbase, there's this concept of a Document ID, where a document has its own unique key that is external to the document itself. What is this document ID good for? The only use I see for it is querying for the object itself (using the USE KEYS clause).

I could just specify an "id" and "type" in my JSON document and just assign random UUIDs for the document key.

What benefits do I get from using it? ELI5 if possible.

Also, why do some developers add "prefixes" to the document ID (e.g., "customer::customername")?

Upvotes: 3

Views: 495

Answers (1)

EbenH

Reputation: 566

That is an excellent question, and the answer is both historical and technical.

Historical: Couchbase originated from CouchOne/CouchDB and Membase, the latter being a persistent distributed version of the memcached key-value store. Couchbase still operates as a key-value store, and the fastest way to retrieve a document is via a key lookup. You could retrieve a document using an index based on one of the document fields, but that would be slower.

Technical: the ability to retrieve documents extremely quickly given their ID is one of the advantages that makes Couchbase attractive for many users and applications (along with scalability and reliability).

Why do some developers add "prefixes" to document IDs, such as "customer::{customer name}"? For reasons related to fast retrieval and data modeling. Let's say you have a small document containing a customer's basic profile, and you use the customer's email address as the document ID. The customer logs in, and your application can retrieve this profile with a very fast k-v lookup, using the email as the ID. You want to keep this document small so it can be retrieved more quickly.

Maybe the customer sometimes wants to view their entire purchase history. Your application might want to keep that purchase history in a separate document, because it's too big to retrieve unless you really need it. So you would store it with the document id {email}::purchase_history, so you can again use a k-v lookup to retrieve it. Also, you don't need to store the key for the purchase history record anywhere - it is implied. Similarly, the customer's mailing addresses might be stored under document ID {email}::addresses. Etc.
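To make the key-naming scheme above concrete, here is a minimal sketch in Python. It uses a plain dict as a stand-in for a key-value bucket (this is not the Couchbase SDK), and the helper names, the sample email, and the document contents are all hypothetical.

```python
# A plain dict stands in for a key-value bucket; this is NOT the Couchbase SDK.
bucket = {}

def profile_key(email):
    # The small profile document is stored directly under the email address.
    return email

def purchase_history_key(email):
    # Derived key: nothing needs to be stored to find this document later.
    return f"{email}::purchase_history"

def addresses_key(email):
    return f"{email}::addresses"

email = "jane@example.com"  # hypothetical customer
bucket[profile_key(email)] = {"name": "Jane"}
bucket[purchase_history_key(email)] = {"orders": [{"sku": "A1", "qty": 2}]}
bucket[addresses_key(email)] = {"shipping": "1 Main St"}

# Login path: fetch only the small profile document.
profile = bucket[profile_key(email)]

# Only when the customer asks for it: fetch the larger document by its derived key.
history = bucket[purchase_history_key(email)]
```

The point is that every lookup is a direct get by key: given only the email, the application can compute the ID of each related document without running a query or storing a reference to it.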

Data modeling in Couchbase is just as important as in a traditional RDBMS, but you go about it differently. There's a nice discussion of this in the free online training: https://training.couchbase.com/online?utm_source=sem&utm_medium=textad&utm_campaign=adwords&utm_term=couchbase%20online%20training&utm_content=&gclid=CMrM66Sgw9MCFYGHfgodSncCGA#

Why does Couchbase still use an external key instead of a primary key field inside the JSON? Because Couchbase still permits non-JSON data (e.g., binary data), which has no fields that could hold a key. In addition, while a relational database can permit multiple fields or combinations of fields to be candidate keys, Couchbase uses the document ID to decide which partition (its version of a shard) stores the document, so the document ID can't be treated like other fields.
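As a rough illustration of why the ID has to live outside the document, here is a simplified sketch of hash-based partitioning in Python. The CRC32-modulo scheme, the partition count, and the function name are assumptions made for illustration; Couchbase's actual mapping differs in detail.

```python
import zlib

NUM_PARTITIONS = 1024  # assumed partition count, for illustration only

def partition_for(doc_id, num_partitions=NUM_PARTITIONS):
    # The partition is computed from the document ID alone; the document
    # body (JSON or binary) is never inspected.
    return zlib.crc32(doc_id.encode("utf-8")) % num_partitions

doc_id = "jane@example.com::purchase_history"  # hypothetical ID
first = partition_for(doc_id)
second = partition_for(doc_id)  # same ID, so the same partition every time
```

Because the location depends only on the ID, a client can go straight to the right node for a key lookup, and the value can be any blob. It also means the ID cannot change the way an ordinary field can, since changing it would move the document to a different partition.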

Upvotes: 4
