Reputation: 3311
I have a collection called Post. My mapping layer ensures that every document has these fields:
The collection is queried and its documents are returned through an API.
So some typical requests might be:
/post?type=image&user_id=2
/post?updated=35234423&order_by=client_id
/post?enabled=true&order_by=id
There is no guarantee that any particular field will appear in the find criteria or in the sort.
Recently, when the collection reached 8 GB of data, I started getting this error:
"localhost:27017: too much data for sort() with no index. add an index or specify a smaller limit"
I have looked at the documentation for MongoDB indexes and found it difficult to tell whether they work the same way as MySQL indexes.
Some threads I found on indexing, such as MongoDB - too much data for sort() with no index error, suggest sorting on specific indexed fields so that the index is used. Obviously I cannot do that when a lot of my filtering and sorting is optional.
Could anyone suggest a solid approach? Should I simply index every field in my collection?
Thanks for the feedback, guys. I've started building an auto-index function:
public function get() {
    $indices['Post'] = array(
        'fields' => array(
            /* Note: dropDups only has an effect together with 'unique',
               and it permanently deletes duplicate documents - use with care.
               It has been removed from the non-unique indexes below, where
               it would have been ignored anyway. */
            'id'             => array('unique' => true, 'dropDups' => true, 'background' => true),
            'client_id'      => array('background' => true),
            'image_id'       => array('background' => true),
            'user_id'        => array('background' => true),
            'publish_target' => array('background' => true),
            'type'           => array('background' => true),
            'status'         => array('background' => true),
            'text'           => array('background' => true)
        )
    );

    foreach ($indices as $key => $index) {
        /* set the collection */
        $collection = $this->mongoDB->{$key};

        /* drop all existing indexes (the built-in _id index is kept) */
        $collection->deleteIndexes();

        /* loop over the fields and create an ascending index on each */
        foreach ($index['fields'] as $subKey => $data) {
            $collection->ensureIndex(
                array($subKey => 1),
                array_merge($data, array('name' => $subKey))
            );
        }
    }

    /* return the list */
    return $indices;
}
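For reference, the same idea sketched in Python (my own illustration, not part of the original function). The field list mirrors the question; the actual index creation would happen through a driver call such as pymongo's create_index on a live server, so only the spec-building part is shown runnable here:

```python
def build_index_specs(fields):
    """Turn {field: options} into a list of pymongo-style index specs.
    Each spec is an ascending single-field index named after the field."""
    specs = []
    for field, options in fields.items():
        spec = {"key": [(field, 1)], "name": field}
        spec.update(options)
        specs.append(spec)
    return specs

# Field list taken from the question's auto-index function.
post_fields = {
    "id": {"unique": True, "background": True},
    "client_id": {"background": True},
    "user_id": {"background": True},
    "type": {"background": True},
    "status": {"background": True},
}

specs = build_index_specs(post_fields)
# On a live server each spec would become something like:
#   collection.create_index(spec["key"], name=spec["name"], ...)
print(len(specs))  # one spec per field
```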
Upvotes: 0
Views: 301
Reputation: 43884
Unfortunately I cannot think of a really good solution for indexing queries of such a dynamic nature. However, this JIRA ticket would really help you: https://jira.mongodb.org/browse/SERVER-3071
I suggest you watch that JIRA ticket.
Upvotes: 1
Reputation: 36794
You need to know upfront what sort of queries are going to hit the server. Without that you cannot do any optimisation, and you run into sort issues like the one you have now.
If your users can sort by any of the nine fields you have, you will need to create an index on each of them. However, remember that sometimes it makes more sense to create a compound index. For example, preventing the issue for:
/post?updated=35234423&order_by=client_id
can only be done by creating an index on:
{ updated: 1, client_id: 1 }
Indexes in MongoDB can only be used if the fields in your query form a left-hand prefix of the fields in the index.
So the index { updated: 1, client_id: 1 } works optimally for:
find( { 'updated' : 1 } );
find( { 'updated' : 1, 'client_id' : 1 } );
find( { 'updated' : 1 } ).sort( { 'client_id' : 1 } );
But not for:
find( { 'client_id' : 1 } );
find( { 'client_id' : 1 } ).sort( { 'updated' : 1 } );
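The left-hand prefix rule behind these examples can be modelled in a few lines of Python. This is a deliberate simplification of MongoDB's real query planner (which handles more cases, such as equality fields in any order); the function name is made up for illustration:

```python
def index_can_serve(index_fields, filter_fields, sort_fields=()):
    """Simplified model of the prefix rule: an index helps when the filter
    fields, followed by the sort fields, form a left-hand prefix of the
    index. Real planner behaviour has more subtleties than this."""
    needed = list(filter_fields) + [f for f in sort_fields if f not in filter_fields]
    return list(index_fields[:len(needed)]) == needed

index = ("updated", "client_id")

# Matches the examples above:
print(index_can_serve(index, ["updated"]))                 # find on 'updated' -> True
print(index_can_serve(index, ["updated", "client_id"]))    # -> True
print(index_can_serve(index, ["updated"], ["client_id"]))  # find + sort -> True
print(index_can_serve(index, ["client_id"]))               # not a prefix -> False
print(index_can_serve(index, ["client_id"], ["updated"]))  # -> False
```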
To reduce the amount of data and prevent your error message, you can additionally add a limit() to each query. With 8 GB of data, I doubt your UI can show that many results anyway, so using limit() makes sense.
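The intuition for why limit() helps: with a limit, the server only needs to keep the top N documents in its sort buffer rather than materialising the whole result set. A Python analogy using a bounded heap (my illustration, not MongoDB's actual implementation):

```python
import heapq
import random

random.seed(42)
docs = [{"id": i, "updated": random.randrange(10**6)} for i in range(100_000)]

# Full in-memory sort: every document participates in the sort.
full = sorted(docs, key=lambda d: d["updated"])[:10]

# Top-N with a bounded heap: memory stays proportional to N, not to the
# collection size - the same reason sort() + limit() avoids the error.
top = heapq.nsmallest(10, docs, key=lambda d: d["updated"])

print([d["updated"] for d in full] == [d["updated"] for d in top])  # True
```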
Upvotes: 2