Reputation: 6121
Let's say I have about 1Tb of SQL email data that has the following fields:
email, email_host, username, first_name, last_name, last_login
I'm told that the MySQL queries like SELECT * FROM emails WHERE email_host = 'gmail.com'
are running a bit slow... we're talking 30 minutes or more.
I'd like to try out Couchbase but I'm not sure how to go about denormalizing the data and building the views
. It seems that one possible approach as per Couchbase's own video tutorial sessions is to create a "key," like:
u::gmail::incremented_id
or
u::john::incremented_id
then only fetch all the u::gmail
s to have a list of keys corresponding to gmail addresses, instead of querying all of the documents and checking if the email_host
attribute is gmail
.
Is this a legitimate / good strategy to go about denormalizing this sample data set and if so, what is the proper technique to create a view to preindex u::whatever
?
Upvotes: 0
Views: 113
Reputation: 3715
You just need a map function like this:
function(doc) {
if (doc.type == 'user')
emit(doc.email_host, doc);
}
Then you query the view using as key gmail.com
to get all the users using gmail. If the database contains other document types, you better include a type attribute in your documents to distinguish different types.
Upvotes: 1
Reputation: 425
I would just add email_host to your document and give it the value behind the '@'. That way all documents have the email host as a field and can be easily queried. Should be high performing.
Upvotes: 0