Reputation: 15365
I'm developing a building repository query.
Here is the query that I am trying to write.
(Exact match on zipCode) AND ((Case-insensitive exact match on address1) OR (Case-insensitive exact match on siteName))
In my repository, I have a document that looks like the following:
address1: 4 Myrtle Street
siteName: Myrtle Street
zipCode: 90210
And I keep getting that document back as a match when I search on:
address1: 45 Myrtle Street
siteName: Myrtle
zipCode: 90210
Here are some attempts that have not worked:
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "address1": {
                    "value": "45 myrtle street"
                  }
                }
              },
              {
                "term": {
                  "siteName": {
                    "value": "myrtle"
                  }
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": {
              "value": "90210"
            }
          }
        }
      ]
    }
  }
}
{
  "query": {
    "filtered": {
      "query": {
        "term": {
          "zipCode": {
            "value": "90210"
          }
        }
      },
      "filter": {
        "or": {
          "filters": [
            {
              "term": {
                "address1": "45 myrtle street"
              }
            },
            {
              "term": {
                "siteName": "myrtle"
              }
            }
          ]
        }
      }
    }
  }
}
{
  "filter": {
    "bool": {
      "must": [
        {
          "or": {
            "filters": [
              {
                "term": {
                  "address1": "45 myrtle street"
                }
              },
              {
                "term": {
                  "siteName": "myrtle"
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": "90210"
          }
        }
      ]
    }
  }
}
{
  "query": {
    "bool": {
      "must": [
        {
          "span_or": {
            "clauses": [
              {
                "span_term": {
                  "siteName": {
                    "value": "myrtle"
                  }
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": {
              "value": "90210"
            }
          }
        }
      ]
    }
  }
}
Does anyone know what I am doing wrong?
I'm writing this with NEST, so I would prefer NEST syntax, but ElasticSearch syntax would certainly suffice as well.
EDIT: Per @Greg Marzouka's comment, here are the mappings:
{
  [indexname]: {
    "mappings": {
      "[indexname]elasticsearchresponse": {
        "properties": {
          "address": {
            "type": "string"
          },
          "address1": {
            "type": "string"
          },
          "address2": {
            "type": "string"
          },
          "address3": {
            "type": "string"
          },
          "city": {
            "type": "string"
          },
          "country": {
            "type": "string"
          },
          "id": {
            "type": "string"
          },
          "originalSourceId": {
            "type": "string"
          },
          "placeId": {
            "type": "string"
          },
          "siteName": {
            "type": "string"
          },
          "siteType": {
            "type": "string"
          },
          "state": {
            "type": "string"
          },
          "systemId": {
            "type": "long"
          },
          "zipCode": {
            "type": "string"
          }
        }
      }
    }
  }
}
Upvotes: 1
Views: 494
Reputation: 3325
Based on your mapping, you won't be able to search for exact matches on siteName because it's being analyzed with the standard analyzer, which is more tuned for full text search. This is the default analyzer that is applied by Elasticsearch when one isn't explicitly defined on a field.
The standard analyzer is breaking up the value of siteName into multiple tokens. For example, Myrtle Street is tokenized and stored as two separate terms in your index, myrtle and street, which is why your query is matching that document. For a case-insensitive exact match, you instead want Myrtle Street stored as a single, lower-cased token in your index: myrtle street.
You could set siteName to not_analyzed, which won't subject the field to the analysis chain at all, meaning the values will not be modified. However, this will produce a single Myrtle Street token, which will work for exact matches, but will be case-sensitive.
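For reference, a minimal sketch of what that not_analyzed mapping might look like in Elasticsearch's JSON syntax, assuming the string field type from your mapping (your_type is a placeholder for your own type name):

```json
{
  "your_type": {
    "properties": {
      "siteName": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
```

With a mapping like this, a term query for Myrtle Street would match the document, but one for myrtle street would not.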
What you need to do is create a custom analyzer using the keyword tokenizer and lowercase token filter, then apply it to your field.
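In plain Elasticsearch syntax, the request body for creating such an index could look roughly like the following. This is a sketch that assumes the string field type shown in your mapping; my_analyzer and your_type are placeholder names, not anything Elasticsearch requires.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "siteName": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
```

The keyword tokenizer emits the whole field value as one token, and the lowercase filter then lower-cases it, so Myrtle Street is indexed as the single term myrtle street.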
Here's how you can accomplish this with NEST's fluent API:
// Create the custom analyzer using the keyword tokenizer and lowercase token filter
var myAnalyzer = new CustomAnalyzer
{
    Tokenizer = "keyword",
    Filter = new [] { "lowercase" }
};
var response = this.Client.CreateIndex("your-index-name", c => c
    // Add the custom analyzer to your index settings
    .Analysis(an => an
        .Analyzers(az => az
            .Add("my_analyzer", myAnalyzer)
        )
    )
    // Create the mapping for your type and apply "my_analyzer" to the siteName field
    .AddMapping<YourType>(m => m
        .MapFromAttributes()
        .Properties(ps => ps
            .String(s => s.Name(t => t.SiteName).Analyzer("my_analyzer"))
        )
    )
);
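Once the index has been recreated and your documents reindexed, a query along these lines should behave as a case-insensitive exact match. This is a sketch, and it assumes you have also applied my_analyzer to address1, which the mapping above does not do. The two text fields use a match query because it runs the search text through the field's analyzer, so the input gets lower-cased for you. A term query is not analyzed, so with term you would have to lower-case the value yourself.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "zipCode": "90210"
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "address1": "45 Myrtle Street"
                }
              },
              {
                "match": {
                  "siteName": "Myrtle"
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}
```

Against the document in your question, this should no longer match: Myrtle is analyzed to the single term myrtle, which is not equal to the indexed term myrtle street, and 45 myrtle street is not equal to 4 myrtle street.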
Upvotes: 3