Wayne Ye
Wayne Ye

Reputation: 2514

Mongo “manual reference” performance compare to traditional DB’s “table joining”

According to the official document: "manual reference" operation is usually preferred, experienced guy even suggest never use DBref, then I am seriously concerning how much of the performance penalty to do twice query when I want to query entities with relational collection, especially comparing with the traditional relational DB - we can retrieve the expected result within one query using table joins.

Denormalize example:

db.blogs.insert({
  _id: 1,
  title: "Investigation on MongoDB",
  content: "some investigation contents",
  post_date: Date.now(),
  permalink: "http://foo.bar/investigation_on_mongodb",
  comments: [
    { content: "Gorgeous post!!!", nickname: "Scott", email: "[email protected]", timestamp: "1377742184305" },
    { content: "Splendid article!!!", nickname: "Guthrie", email: "[email protected]", timestamp: "1377742184305" }
  ]}
               )

We can simply use: db.blogs.find() to get everything we want: blog posts with comments belong to them.

Normalize example:

db.books.insert({ 
  _id: 1, 
  name: "MongoDB Applied Design Patterns", 
  price: 35, 
  rate: 5, 
  author: "Rick Copeland",
  ISBN: "1449340040",
  publisher_id: 1,
  reviews: [
    { isUseful: true, content: "Cool book!", reviewer: "Dick", timestamp: "1377742184305" },
    { isUseful: true, content: "Cool book!", reviewer: "Xiaoshen", timestamp: "1377742184305" }
  ]
  } 
); 
  
db.publishers.insert({ 
  _id: 1, 
  name: "Packtpub INC", 
  address: "2nd Floor, Livery Place 35 Livery Street Birmingham",
  telephone: "+44 0121 265 6484",
  } 
);

Now if I want to get the complete information about a single book I have to manually query twice, similar to below:

> var book =  db.books.find({ "name": { $regex: 'mongo*', $options: 'i' } })
> db.publishers.find({ _id: book.publisher_id })

Things I know is: the precedence operations will be process by Mongo "in memory", but I will have the summarized question below:

In simply words: document oriented database advocates "denormalize" data to retrieve result within one query, however, when we have to store relational data, it "suggest" you to use "manual reference", which means twice query, while in relational DB there will be only one time query by using "table joining".

This makes no sense for me:)

Upvotes: 2

Views: 448

Answers (1)

Philipp
Philipp

Reputation: 69703

A relational database also performs a JOIN by querying both tables. But it has the advantage that it can do this internally and doesn't have to communicate with the client to do this. It can query the 2nd table immediately.

MongoDB first needs to send the results of the first query to the client before the application can formulate and send the 2nd query back to the database. The time lost by this is:

  1. Network latency between database server and application server (a couple ms)
  2. Interprete the response on the application server and generate a $in-query from it (a couple µs)
  3. Network latency between application server and database server (a couple ms)

Depending on how well the application server and the database server are interconnected, we are talking about a penalty of a few ms here.

Upvotes: 1

Related Questions