ppn029012
ppn029012

Reputation: 570

How to query data efficiently in large mongodb collection?

I have one big mongodb collection (3-million docs, 50 GigaBytes), and it would be very slow to query the data even I have created the indexs.

db.collection.find({"C123":1, "C122":2})

e.g. the query will be timeout or will be extreme slow (10s at least), even if I have created the separate indexes for C123 and C122.

Should I create more indexs or increase the physical memory to accelerate the querying?

Upvotes: 6

Views: 22899

Answers (4)

navaltiger
navaltiger

Reputation: 883

  • Create Right indices and carefully use compound index. (You can have max. 64 indices per collection and 31 fields in compound index)
  • Use mongo side pagination
  • Try to find out most used queries and build compound index around that.
  • Compound index strictly follow sequence so read documentation and do trials
  • Also try covered query for 'summary' like queries

Learned it hard way..

Upvotes: 2

Poonam Agrawal
Poonam Agrawal

Reputation: 51

Use skip and limit. Run a loop for 50000 data at once .

https://docs.mongodb.com/manual/reference/method/cursor.skip/

https://docs.mongodb.com/manual/reference/method/cursor.limit/ example :

[
  {
    $group: {
      _id: "$myDoc,homepage_domain",
      count: {$sum: 1},
      entry: {
        $push: {
          location_city: "$myDoc.location_city",
          homepage_domain: "$myDoc.homepage_domain",
          country: "$myDoc.country",
          employee_linkedin: "$myDoc.employee_linkedin",
          linkedin_url: "$myDoc.inkedin_url",
          homepage_url: "$myDoc.homepage_url",
          industry: "$myDoc.industry",
          read_at: "$myDoc.read_at"
        }
      }
    }
  }, {
    $limit : 50000
  }, {
    $skip: 50000
  }
],
{
  allowDiskUse: true
},
print(
  db.Or9.insert({
    "HomepageDomain":myDoc.homepage_domain,
    "location_city":myDoc.location_city
  })
)

Upvotes: 1

Alan Mroczek
Alan Mroczek

Reputation: 1199

For such a query you should create compound indexes. One on both fields. And then it should be very efficient. Creating separate indexes won't help you much, because MongoDB engine will use first to get results of first part of query, but second if is used won't help much (or even can slow down in some cases your query because of lookup in indexes table and then in real data again). You can confirm used indexes by using .explain() on your query in shell.

See compound indexes:

https://docs.mongodb.com/manual/core/index-compound/

Also consider sorting directions on both your fields while making indexes.

Upvotes: 9

Sergio Tulentsev
Sergio Tulentsev

Reputation: 230286

The answer is really simple.

  1. You don't need to create more indexes, you need to create the right indexes. Index on field c124 won't help queries on field c123, so no point in creating it.

  2. Use better/more hardware. More RAM, more machines (sharding).

Upvotes: 2

Related Questions