Reputation: 249
First time I am using mongodb, I need help with index suggestions. In my mongo database I have collection with nearly 15 million documents.Attached is json document. The "playfields" is array of embedded documents . I frequently run following queries. The data in value field is stored in different datatypes (string, int). My question is what is best index to add on this documents to cover below queries:
db.playfieldvalues.find({"playfields":{$elemMatch:{ID:"Play.NHL.NHLEventX", value: -47}}})
Sample JSON Doc:
{
"Playid": "9dbc7763-7d47-44e9-8b11-e1ec864407bb",
"Playfields": [
{
"ID": "XInfo_OffTeam",
"Value": "Dallas",
"TS": "201709291148408452"
},
{
"ID": "XInfo_DefTeam",
"Value": "Kansas",
"TS": "201709291148456267"
},
{
"ID": "XInfo_Period",
"Value": 1,
"TS": "201709291148456267"
},
{
"ID": "XInfo_OffScore",
"Value": 38,
"TS": "201709291148456267"
},
{
"ID": "XInfo_DefScore",
"Value": 21,
"TS": "201709291148456267"
},
{
"ID": "XInfo_PlayAction",
"Value": "Y",
"TS": "201709291148456267"
}
]
}
Upvotes: 1
Views: 2191
Reputation: 47865
If your queries on Playfields
always use the $elemMatch
operator to address a pairing of ID
and Value
then creating a composite index on Playfields.ID
, Playfields.Value
sounds sensible:
db.collection.ensureIndex({ "Playfields.ID": 1, "Playfields.Value": 1 })
But ... any index creation should probaby be accompanied by some empirical testing. Run your queries with explain()
to understand whether your index is actually chosen and, if so, how selective it is:
db.playfieldvalues.find({ "playfields": { $elemMatch: { ID: "Play.NHL.NHLEventX", Value: -47 } } })
.explain()
There are plenty of details in the docs on the contents of the 'explain document' but the first things to check are:
Crudely speaking the closer 'keys examined' is to the number of docs returned the more index coverage you have achieved.
If your index has not been chosen by MongoDB then you should review the details provided by the explain plan on rejected plans and why they were rejected.
There is also the question of the cost of an index (in terms of impact on write times and index storage) so I'd suggest considering your non functional requirements - can your desired elapsed times be achieved without a dedicated index? If not, then you should proceed with empirical testing but be prepared to tweak your choice in reponse to what the explain()
output tells you.
Upvotes: 2