Reputation: 574
I have a database containing product information (SKU, model number, descriptions, etc) and I'd like to have a relatively quick search function where a user can just type in a few letters or a word from any of the the text fields and then get a list of products that contain that phrase in any of those fields.
The number of items in the database will probably not be more than 100,000.
What would be the easiest way to accomplish this, without creating complex queries?
Upvotes: 2
Views: 678
Reputation: 1717
It sounds like you're looking for an autocomplete. There are numerous ways to do this.
No matter the solution you choose, you'll want to put some indices on your data. I recommend adding a skiplist to everything you're going to be searching, and an additional fulltext index on any long-form text (such as product description). String comparison uses skiplists, while only a FULLTEXT
search will leverage a fulltext index.
You have some choices here.
https://docs.arangodb.com/3.11/aql/functions/string/#like
You could run your search something like:
for product in warehouse
filter like(product.model, @searchTerm, true) or
like(product.sku, @searchTerm, true)
return product
Advantage: simple query syntax, multiple attributes in one search, supports substrings, can search the middle of a body of text.
Disadvantage: relatively slow.
This is a lot more complex for querying, but is very responsive, and is the approach my application uses for its autocomplete.
let sku = (for result in fulltext("warehouse", "sku", "prefix:@seacrhTerm")
return {sku: result.sku, model: result.model, description: result.description}
let model = (for result in fulltext("warehouse", "model", "prefix:@searchTerm")
return {sku: result.sku, model: result.model, description: result.description}
let description = (for result in fulltext("warehouse", "description", "prefix:@searchTerm")
return {sku: result.sku, model: result.model, description: result.description}
let resultsMatch = union(sku,model,description)
return resultsMatch
Advantage: Very fast, extremely responsive, can handle very long bodies of text with ease, searches anywhere in a text body.
Disadvantage: Complex query structure as you need one variable for every attribute you're searching, a fulltext index created on each of those attributes you're searching, and a union at the end. You may need to do a union of the unioned results depending on how advanced your search needs to be. Doesn't support substring searching.
Simply create a query that filters for results to be greater than or equal to your search term, but less than your search term with the last letter incremented by 1. Example is in the link under the Foxx portion of my answer. This leverages skiplists.
Advantage: Very fast as long as the field is not tremendously long. Extremely easy to implement.
Disadvantage: Doesn't support substring searches. Only searches the first part of a string. I.e. you must know the beginning of the field you're searching.
This will work very well for quickly searching something like a model number where your users will probably know the beginning of it, but poorly for something like a description in which your users are probably searching for words somewhere in the middle of a body of text.
Jan's little Cookbook example is a good place to start:
I would recommend abstracting whatever you do into a Foxx service. It is especially liberating if you need to dynamically build up AQL queries in database, in case you have a huge number of fields and collections to search and you need to generate a Fulltext search dynamically.
Experiment and see which of these works best for you. My best guess is that you will find the Fulltext solution the best if you need to search on product descriptions. If you expect your users to always search the first few letters of a field, just use the comparison with a skiplist as it is very very fast.
Upvotes: 3