Reputation: 2116
My question is: what is the best practice for querying a MongoDB when you need to do either a fuzzy text search and sort by search score, or do an unfiltered query that sorts by a property, but both are required to return a pagination token?
To elaborate, I have a MongoDB with documents that contain a name
and description
property and I have the following business requirements:
description
field. Return the first 10 results, sorted by the fuzzy match score. Allow pagination via a "Show More" button.name
. Allow pagination via a "Show More" button.I have hacked together the following solution. Basically doing the fuzzy search, or doing a .*
regex search if there has been no input from the user. This seems like it will be bad for performance once the collection grows, but it was the only way I could think to still utilize a $search
in both cases so I could perform pagination without using skip/limit.
pipeline = []
if search:
pipeline.append({
'$search': {
'index': 'autocomplete_index',
'autocomplete': {
'path': 'description',
'query': search,
'tokenOrder': 'any',
'fuzzy': {
'maxEdits': 2,
'prefixLength': 1,
'maxExpansions': 256
}
},
"sort": {'unused': {'$meta': "searchScore"}},
}
})
else:
pipeline.append({
'$search': {
'index': 'regex_index',
'regex': {
'path': 'name',
'query': '.*',
'allowAnalyzedField': True
},
"sort": { 'name': 1 }
}
})
if pagination_token:
for stage in pipeline:
if '$search' in stage:
stage['$search']['searchAfter'] = pagination_token
pipeline.append({
"$limit": page_size
})
pipeline.append({
"$project": {
"_id": 1,
"name": 1,
"description": 1,
"pagination_token": { "$meta" : "searchSequenceToken" }
}
})
results = self.collection.aggregate(pipeline)
I have had to create the following indexes to support the current solution
autocomplete_index
{
"mappings": {
"dynamic": false,
"fields": {
"description": {
"type": "autocomplete"
}
}
}
}
regex_index:
{
"mappings": {
"fields": {
"name": [
{
"type": "token"
},
{
"type": "string"
}
]
}
}
}
Upvotes: 0
Views: 289