Reputation: 12538
I am working on building a simple arango query where if the user enters: "foo bar" (starting to type Foo Barber), the query returns results. The issue I am running in to is going from a normal single space separated string (i.e. imagine LET str = "foo barber" at the top), to having multiple wildcard queries like shown below.
Also, open to other queries that would work for this, i.e. LIKE, PHRASE or similar.
The goal is when we have a single string like 'foo bar', search results are returned for Foo Barber and similar.
FOR doc IN movies SEARCH PHRASE(doc.name,
[
{WILDCARD: ["%foo%"]},
{WILDCARD: ["%bar%"]}
], "text_en") RETURN doc
Upvotes: 0
Views: 528
Reputation: 11915
If you want to find Black Knight
but not Knight Black
if the search phrase is black kni
, then you should probably avoid tokenizing Analyzers such as text_en
.
Instead, create a norm
Analyzer that removes diacritics and allows for case-insensitive searching. In arangosh:
var analyzers = require("@arangodb/analyzers");
analyzers.save("norm_en", "norm", {"locale": "en_US.utf-8", "accent": false, "case": "lower"}, []);
Add the Analyzer in the View definition for the desired field (should be title
and not name
, shouldn't it?). You should then be able to run queries like:
FOR doc IN movies SEARCH ANALYZER(STARTS_WITH(doc.title, TOKENS("Black Kni", "norm_en")[0]), "norm_en") RETURN doc
FOR doc IN movies SEARCH ANALYZER(LIKE(doc.title, TOKENS("Black Kni%", "norm_en")[0]), "norm_en") RETURN doc
FOR doc IN movies SEARCH ANALYZER(LIKE(doc.title, CONCAT(TOKENS(SUBSTITUTE("Black Kni", ["%", "_"], ["\\%", "\\_"]), "norm_en")[0], "%")), "norm_en") RETURN doc
The search phrase Black Kni
is normalized to black kni
and then used for a prefix search, either using STARTS_WITH()
or LIKE()
with a trailing wildcard %
. The third example escapes user-entered wildcard characters.
Upvotes: 0