Reputation: 856
We are currently using a collection called items
which contain 10 million entries in our MongoDB database.
This collection contains (amongst many others) two columns named title
and country_code
. One such entry looks like this
{
"_id": ObjectId("566acf868fdd29578f35e8db"),
"feed": ObjectId("566562f78fdd2933aac85b42"),
"category": "Mobiles & Tablets",
"title": "360DSC Crystal Clear Transparent Ultra Slim Shockproof TPU Case for Iphone 5 5S (Transparent Pink)",
"URL": "http://www.lazada.co.id/60dsc-crystal-clear-transparent-ultra-slim-shockproof-tpu-case-for-iphone-5-5s-transparent-pink-3235992.html",
"created_at": ISODate("2015-12-11T13:28:38.470Z"),
"barcode": "36834ELAA1XCWOANID-3563358",
"updated_at": ISODate("2015-12-11T13:28:38.470Z"),
"country_code": "ID",
"picture-url": "http://id-live.slatic.net/p/image-2995323-1-product.jpg",
"price": "41000.00"
}
The cardinality on column country_code
is very high. We have created two text indices for these columns:
db.items.createIndex({title: "text", country_code: "text"})
In our examples we try to query:
db.items.find({"title": { "$regex": "iphone", "$options": "i" }, country_code: "US"}).limit(10)
A query which takes around 6 seconds to complete, which seems unusually high for this type of database.
Whenever we try a country_code
(e.g., country_code: "UK") that has less results, it will return results within milliseconds.
Would there be any particular reason, why these queries differ so much in their time to return results?
EDIT: All answers here helped so if you have this issue yourself, try all of 3 of the solutions. Could only mark 1 as correct though.
Upvotes: 2
Views: 1991
Reputation: 1952
You should build an index like {country_code: 1, title: "text"}
.
Equal is much more faster than regular expression, make it count.
Upvotes: 0
Reputation: 59691
Switch around the order of the fields in your index. Order matters.
db.items.createIndex({country_code: "text", title: "text"})
Ensure you maintain this order when querying:
db.items.find({country_code: "US", "title": { "$regex": "iphone", "$options": "i" }}).limit(10)
What this will do is drastically decrease the amount of title
fields you need so search a substring for.
Also as mentioned by @Jaco, you should take advantage of your "text" index. See how to query a text index here.
Upvotes: 1
Reputation: 21766
As you do an exact search on country_code
, you can add the text index on title
only:
db.items.createIndex({title:"text"})
and add a seperate index on country_code
:
db.items.createIndex({country_code:1})
As you have defined a text
index on title
you don't have to use a regular expression, but instead you can do a text search like this:
db.items.find({$text:{$search:"iphone"},country_code:"US"})
Upvotes: 0