Nearpoint
Nearpoint

Reputation: 7362

Understanding mongodb indexing: Indexing linked document that is frequently queried

Here is a simple example of the kind of query my app makes

So you have a notepad and notes collection. The note documents are linked to the notepad document using the notepadId.

So a typically query looks like:

Notes.find({notepadId: notepad._id})

A user can have hundreds of notes per notepad. And they switch between notepads often. Also users collaborate on notepads.

I also have a mechanism that counts all the notes in the notepad on the server on page load. This is because I need to know how many pages to show since I use pagination.

Notes.find({notepadId: notepad._id}).count()

I am not very familiar with MongoDB indexing and have tried to search for 'Indexing linked documents' and can't find any info on indexing linked documents.

Is this a common practice to index linked document fields?

So in this case I would want to set an index on the Notes collection, on the field notepadId

Good idea, bad idea, why?

Upvotes: 0

Views: 75

Answers (2)

JJussi
JJussi

Reputation: 1580

Those answers where (index covered query) was talked about, forgot to tell that, to get fully index covered result, you must do projection what excludes _id -field.

Let me explain.

In full index covered query, we read ONLY index and not anything else from disk. If we have f.ex. Notes.createIndex({notepadId: 1, archived: 1}) (what covers both of those two queries), and we do Notes.find({notepadId: notepad._id}).count() there will be two different actions. First there is "find" what collects result set and then "count" what counts how many documents was in that result set.

Find ALWAYS includes (in it's result set) "_id" field, if it is NOT excluded in projection. Because our index don't have _id -field, only pointer to documents disk location, find first finds required document using index and then jumps to disk location where that index entry points, to read document's _id. Not what we wanted. To prevent this, we need projection. So query:

Notes.find({notepadId: notepad._id},{"_id":0, "notepadId":1}).count()

will return (in find section) only list of "notepadId"'s and not anything else.

The other query would be: Notes.find({notepadId: notepad._id, archived: false},{"_id":0, "notepadId":1, archived:1}).count()

You can test this easily doing two different queries:

Notes.find({notepadId: notepad._id}).explain()
Notes.find({notepadId: notepad._id},{"_id":0, "notepadId":1}).explain()

and comparing results.

Upvotes: 0

Alex Blex
Alex Blex

Reputation: 37108

The documents become "linked" only on application layer. There is nothing special in notepadId field in mongodb itself. Indexing by this field will make counting extremely efficient, as it is a covered query, which requires no disk IO.

Upvotes: 1

Related Questions