Reputation: 14290
I've built a search function with its own indexer and searcher. The problem appears when the search query contains white space. Example below.
These persons exist in my index:
Person number | First name | Last name |
---|---|---|
1 | Ilse | Van de Burg |
2 | Devolder | Marlijn |
I've tried the following queries:
Query number | Term | Actual result* | Expected result* |
---|---|---|---|
1 | van | 1 | 1 |
2 | van de | 1 | 1 |
3 | ilse | 1 | 1 |
4 | van de burg | 1 | |
5 | van de burg ilse | 1 | |
6 | de | 1 & 2 | 1 & 2 |
7 | devolder | 2 | 2 |
8 | devolder marlijn | 2 | |
9 | marijn devolder | 2 | |
* Number of the person. If empty: nothing found or expected.
Some queries don't return what I expected. How can I solve this?
Here is my code:
BaseSearchProvider searcher = ExamineManager.Instance.SearchProviderCollection["PersonSearcher"];
ISearchCriteria searchCriteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
ISearchCriteria query = searchCriteria.Field("lastname", term.MultipleCharacterWildcard()).Or()
.Field("firstname", term.MultipleCharacterWildcard()).Or()
.OrderBy("lastname", "firstname").Compile();
return searcher.Search(query);
Examine index:
<IndexSet SetName="Artsen" IndexPath="~/App_Data/TEMP/ExamineIndexes/Artsen/">
  <IndexAttributeFields>
    <add Name="id" Type="int" />
    <add Name="nodeName" />
    <add Name="nodeTypeAlias" />
  </IndexAttributeFields>
  <IndexUserFields>
    <add Name="email" />
    <add Name="fax" />
    <add Name="naam" EnableSorting="true" />
    <add Name="onderzoeken" Type="int[]" />
    <add Name="specialismen" Type="int[]" />
    <add Name="subspecialismen" Type="int[]" />
    <add Name="telefoon" />
    <add Name="titel" EnableSorting="true" />
    <add Name="voornaam" EnableSorting="true" />
    <add Name="website" />
  </IndexUserFields>
  <IncludeNodeTypes>
    <add Name="arts" />
  </IncludeNodeTypes>
</IndexSet>
Examine settings (examine index provider):
<add name="ArtsenIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine" supportUnpublished="false"
supportProtected="true" indexSet="Artsen"
analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
Examine settings (examine search provider):
<add name="ArtsenSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" supportUnpublished="false"
supportProtected="false" indexSet="Artsen" enableLeadingWildcard="true"
analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
I've also tried this and got the best results:
query = searchCriteria.GroupedOr(new List<string>() { "naam" }, term.MultipleCharacterWildcard(), term.Escape()).Or()
.GroupedOr(new List<string>() { "voornaam" }, term.MultipleCharacterWildcard(), term.Escape()).Or()
.GroupedOr(new List<string>() { "titel" }, term.MultipleCharacterWildcard(), term.Escape()).Or()
.OrderBy("naam", "voornaam").Compile();
When I call ToString() on the searchCriteria of the query above after searching for van de burg, I get this:
{ SearchIndexType: , LuceneQuery: (naam:van de burg* (naam:van de burg)) (voornaam:van de burg* (voornaam:van de burg)) (titel:van de burg* (titel:van de burg)) }
The remaining problem shows up when two persons share the same last name. For example:
Person number | First name | Last name |
---|---|---|
3 | Marc | De Vadder |
4 | Freddy | De vadder |
Search results:
Queries 1 through 9 now all return the expected results.
Query number | Term | Actual result* | Expected result* |
---|---|---|---|
10 | de vadder | 3 & 4 | 3 & 4 |
11 | de vadder freddy | 3 & 4 | 4 |
12 | de vadder marc | 3 & 4 | 3 |
* Number of the person. If empty: nothing found or expected.
Upvotes: 1
Views: 1282
Reputation: 1728
Your results are consistent with the query: you are searching for the term in first name OR last name OR title, so you get every record in which any of those fields contains part of the phrase.
Because Examine does not fully support phrase queries, my suggestion is to create one searchable field that stores all of those fields combined, and to build the query against that field, looking for the individual terms of the phrase rather than the whole phrase. This can get tricky, because you may not be able to control the order of the fields, so results can still be inconsistent. It is worth experimenting with.
Sample code for this approach might look like this:
if (searchTerm.Contains(" "))
{
string[] terms = searchTerm.Split(' ');
examineQuery.And().GroupedOr(new List<string> { SearchableFieldToSearch }, terms);
}
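As a rough sketch of the same idea with a stricter match: if every word of the search phrase has to be present (so that "de vadder freddy" only matches person 4), each term can be turned into a required prefix clause. The helper below only builds a Lucene query string; the class name, the method names and the combined field name fullName are hypothetical, and it assumes such a combined field has already been added to the index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PersonQueryBuilder
{
    // Characters with special meaning in the Lucene classic query syntax.
    private const string LuceneSpecialChars = "+-&|!(){}[]^\"~*?:\\/";

    // Backslash-escape special characters so user input is treated literally.
    private static string EscapeTerm(string term)
    {
        return string.Concat(term.Select(c =>
            LuceneSpecialChars.IndexOf(c) >= 0 ? "\\" + c : c.ToString()));
    }

    // Turns "de vadder freddy" into one required prefix clause per word.
    // fieldName is assumed to be a combined field (hypothetical name: fullName).
    public static string BuildAllTermsQuery(string fieldName, string searchTerm)
    {
        IEnumerable<string> clauses = (searchTerm ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => "+" + fieldName + ":" + EscapeTerm(t.ToLowerInvariant()) + "*");
        return string.Join(" ", clauses);
    }
}
```

Calling PersonQueryBuilder.BuildAllTermsQuery("fullName", "de vadder freddy") returns +fullName:de* +fullName:vadder* +fullName:freddy*. If your Examine version exposes a raw Lucene query method on the search criteria, the string could be passed there; otherwise the same clause-per-term structure would have to be rebuilt with the fluent API.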
A second option is to separate the fields in the search form itself (separate inputs for first name, last name and title, if that's possible) and to build the query with a GroupedAnd operation.
// Keep the values in the same order as the fields: naam = last name, voornaam = first name.
criteria.GroupedAnd(new[] { "naam", "voornaam", "titel" }.ToList(), new[] { lastName, firstName, title });
You can read more about Grouped Operations in the documentation: https://github.com/Shazwazza/Examine/wiki/Grouped-Operations.
If none of the above works, it may be worth building a query with custom boosting and simply stripping out the results whose score is lower than expected.
Hope this helps and points you in the right direction. Share your results :)
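The trimming step could look roughly like the sketch below. Lucene scores are not normalised across queries, so it uses a cutoff relative to the best hit instead of a fixed number. ScoredHit is a stand-in type made up for this example (Examine's own results expose a comparable Id and Score), and the ratio is something you would have to tune against your own data.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a search hit with an id and a relevance score.
public sealed class ScoredHit
{
    public int Id { get; set; }
    public float Score { get; set; }
}

public static class ScoreFilter
{
    // Keeps only the hits whose score is at least `ratio` times the best score.
    public static List<ScoredHit> KeepTopScoring(IEnumerable<ScoredHit> hits, float ratio)
    {
        List<ScoredHit> ordered = hits.OrderByDescending(h => h.Score).ToList();
        if (ordered.Count == 0)
        {
            return ordered;
        }

        float cutoff = ordered[0].Score * ratio;
        return ordered.Where(h => h.Score >= cutoff).ToList();
    }
}
```

With a ratio of 0.5f, for example, a hit has to score at least half as well as the best match to stay in the list.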
Upvotes: 4