Reputation: 3183
I have a compound text index on a collection on which a map-reduce job runs:
db.jobs.createIndex({
    Name: "text",
    Line1: "text",
    City: "text",
    State: "text",
    Zip: "text",
    PropertyId: "text",
    Line2: "text",
    JobId: 1,
    JobOwner: 1,
    Amount: 1
}, {
    weights: {
        Name: 100
    },
    name: "custom_text_index"
})
One document has a Line1 value (a text-indexed field) of around 370 KB, and because of this the map-reduce fails with the error below:
2018-04-22T13:34:50.666+0000 E QUERY [thread1] Error: map reduce failed:{
"code" : 17280,
"ok" : 0,
"errmsg" : "MR parallel processing failed: { ok: 0.0, errmsg: \"WiredTigerIndex::insert: key too large to index, failing 371495 { : { Agency_Id: 190.0, PropertyId: \"070720762\", Name: \"MOUNT SINAI SCHOOL OF M...\", code: 17280, codeName: \"KeyTooLong\" }"
The MongoDB documentation says text indexes can be large. Does that still hold for the compound index above, or is it subject to the index key limit of 1024 bytes?
Upvotes: 0
Views: 477
Reputation: 10918
The 1024-byte limit applies to every index entry. The documentation states:
The total size of an index entry, which can include structural overhead depending on the BSON type, must be less than 1024 bytes.
and also
MongoDB will not insert into an indexed collection any document with an indexed field whose corresponding index entry would exceed the index key limit, and instead, will return an error. Previous versions of MongoDB would insert but not index such documents.
That explains the error you are seeing.
Right next to the statement that you cited from the documentation
text indexes can be large.
there's the following, too:
They contain one index entry for each unique post-stemmed word in each indexed field for each document inserted.
So the text index as a whole can be larger than 1024 bytes, but each individual index entry in it must not be.
That's why I suspect that somewhere inside your 370 KB Line1 value there is a single word that is longer than 1024 bytes.
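One rough way to test that suspicion is to look for the longest word in the offending value. The sketch below is plain JavaScript; the helper names utf8ByteLength and longestToken are made up for illustration, and splitting on whitespace only approximates what the text index does (it ignores stemming, punctuation handling and the per-entry structural overhead), so treat the result as a hint rather than an exact index key size.

```javascript
// Hypothetical helpers, not part of MongoDB: approximate the size of the
// largest "word" in a string by splitting on whitespace.

// Count the UTF-8 bytes of a string without relying on Buffer or TextEncoder.
function utf8ByteLength(str) {
  let bytes = 0;
  for (const ch of str) {            // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp < 0x80) bytes += 1;
    else if (cp < 0x800) bytes += 2;
    else if (cp < 0x10000) bytes += 3;
    else bytes += 4;
  }
  return bytes;
}

// Return the whitespace-delimited token with the largest UTF-8 size.
function longestToken(text) {
  let best = { token: "", bytes: 0 };
  for (const token of text.split(/\s+/)) {
    const bytes = utf8ByteLength(token);
    if (bytes > best.bytes) best = { token, bytes };
  }
  return best;
}
```

Calling longestToken on the Line1 value of the suspect document and comparing the returned bytes against 1024 would show whether a single oversized token is plausible.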
To rule out the compound index as the culprit, you could change the index so that it covers only the Line1 field and see how that goes:
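// A collection can have only one text index, so drop the existing one first.
db.jobs.dropIndex("custom_text_index")
// Index only Line1; the weights option is omitted because it referred to Name.
db.jobs.createIndex({
    Line1: "text"
}, {
    name: "custom_text_index"
})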
Upvotes: 1