Reputation: 693
I have a sample document in MongoDB (I am still new to MongoDB):
{
"ID": 0,
"Facet1":"Value1",
"Facet2":[
{
"Facet2Obj1":{
"Obj1Facet1":"Value11",
"Obj2Facet1":"Value21",
"Obj3Facet1":"Value31"
}
},
{
"Facet2Obj2":{
"Obj1Facet2":"Value12",
"Obj2Facet2":"Value22",
"Obj3Facet2":"Value32"
}
},
{
"Facet2Obj3":{
"Obj1Facet3":"Value13",
"Obj2Facet3":"Value23",
"Obj3Facet3":"Value33"
}
}
],
"Facet3":"Value3",
"Facet4":{
"Facet4Obj1":{
"Obj1Facet1":"Value4111"
}
}
}
The map-reduce is somewhat complex, and it gives the following output (for 30,000 documents):
{
"_id" : "Facet1",
"value" : [
{
"value" : "Value1",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
}
]
}
{
"_id" : "ID",
"value" : [
{
"value" : 0,
"count" : 1,
"ID" : [
0
]
},
{
"value" : 1,
"count" : 1,
"ID" : [
1
]
},
.
.
.
]
}
{
"_id" : "Facet2",
"value" : [
{
"value" : "Facet2Obj1",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
},
{
"value" : "Facet2Obj2",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
},
{
"value" : "Facet2Obj3",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
}
]
}
{
"_id" : "Facet3",
"value" : [
{
"value" : "Value3",
"count" : 30000,
"ID" : [
0,
1,
2,
.
.
.
]
}
]
}
{
"_id" : "Facet4",
"value" : [
{
"value" : "Facet4Obj1",
"count" : 30000,
"ID" : [
0,
1,
2,
.
.
.
]
}
]
}
I inserted 30,000 documents in this format (with different IDs) into MongoDB and then ran a map-reduce, but it was slow: with 30,000 documents it took about 30 minutes. After I added an index on the facets it became somewhat faster, taking about 350 seconds, but with 50,000 documents it again took about 30 minutes. When I check the indexes using db.collection.getIndexes(),
MongoDB returns this output:
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "database.collection",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"ID" : 1,
"Facet1" : 1,
"Facet2" : 1,
"Facet3" : 1,
"Facet4" : 1
},
"ns" : "database.collection",
"name" : "ID_1_Facet1_1_Facet2_1_Facet3_1_Facet4_1"
}
Did I do anything wrong with the index, given that the map-reduce is still not fast enough? I understand that indexes must be placed strategically, or they can hurt performance instead of helping.
Answers are greatly appreciated. Thanks in advance.
Upvotes: 2
Views: 2858
Reputation: 42342
MapReduce passes every document in the collection to the map function unless you pass it a {query: } option, which it uses to pre-filter the documents sent to map. You can also pass a {sort: } option to mapReduce, and it will send documents to the map function sorted on the given field(s).
Those are the only two places where indexes will be used; after that, everything happens in the JavaScript thread that is spawned for the work.
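To make that concrete, here is a minimal sketch of a mapReduce call that uses both options. It is not the asker's actual job (which was not posted): the names mapFacet1, reduceFacet and options, and the ID range in the query, are assumptions made for illustration. It counts documents per Facet1 value and collects their IDs, loosely mirroring the output shape in the question.

```javascript
// Map: runs once per document that survives the {query} filter.
// In MongoDB, `this` is the current document and `emit` is supplied by the server.
function mapFacet1() {
  emit(this.Facet1, { count: 1, ID: [this.ID] });
}

// Reduce: merges all values emitted for one key.
// It returns the same shape that map emits, so it can safely be re-applied.
function reduceFacet(key, values) {
  var result = { count: 0, ID: [] };
  values.forEach(function (v) {
    result.count += v.count;
    result.ID = result.ID.concat(v.ID);
  });
  return result;
}

// The two options where an index can help. The range on ID is a
// hypothetical filter; ID is the leading field of the compound index
// shown in the question, so that index can serve both options.
var options = {
  query: { ID: { $lt: 1000 } }, // pre-filters documents before map runs
  sort: { ID: 1 },              // order in which documents reach map
  out: { inline: 1 }            // return results instead of writing a collection
};

// Only executes inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.collection.mapReduce(mapFacet1, reduceFacet, options));
}
```

If the job really has to visit every document, as the output in the question suggests, there is no query to narrow the input, and adding more indexes will not speed up the map and reduce phases themselves.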
Upvotes: 5