Reputation: 905
I have a bunch of documents like the one below. I want to filter for documents whose projectKey starts with ~. I read some articles that say ~ is an operator in Elasticsearch query syntax, so I cannot filter on it directly. Can someone help me form the search query for the /branch/_search API?
{
"_index": "branch",
"_type": "_doc",
"_id": "GAz-inQBJWWbwa_v-l9e",
"_version": 1,
"_score": null,
"_source": {
"branchID": "refs/heads/feature/12345",
"displayID": "feature/12345",
"date": "2020-09-14T05:03:20.137Z",
"projectKey": "~user",
"repoKey": "deploy",
"isDefaultBranch": false,
"eventStatus": "CREATED",
"user": "user"
},
"fields": {
"date": [
"2020-09-14T05:03:20.137Z"
]
},
"highlight": {
"projectKey": [
"~@kibana-highlighted-field@user@/kibana-highlighted-field@"
],
"projectKey.keyword": [
"@kibana-highlighted-field@~user@/kibana-highlighted-field@"
],
"user": [
"@kibana-highlighted-field@user@/kibana-highlighted-field@"
]
},
"sort": [
1600059800137
]
}
UPDATE
I followed prerana's answer below and used prefix in my query.
Something is still wrong when I combine prefix and range: I get the error below. What am I missing?
GET /branch/_search
{
"query": {
"prefix": {
"projectKey": "~"
},
"range": {
"date": {
"gte": "2020-09-14",
"lte": "2020-09-14"
}
}
}
}
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
}
],
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
},
"status": 400
}
Upvotes: 0
Views: 164
While @hansley's answer would work, it requires you to create a custom analyzer. You also mentioned that you want only the docs that start with ~, but in his result I see all the docs containing ~. So I am providing my answer, which requires very little configuration and works as required.
The index mapping is the default, so just index the docs below and ES will create a default mapping with a .keyword sub-field for every text field.
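For reference, this is roughly what the dynamically generated mapping for a string field such as title looks like (GET /<index>/_mapping returns it nested under the index name). The field name title is taken from the sample docs; the rest is the usual dynamic-mapping default, so check it against your own index:

```json
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
```

The keyword sub-field stores the whole value as a single unanalyzed term, which is why a prefix query on title.keyword can see the leading ~. Note that with ignore_above set to 256, longer values are not indexed in the keyword sub-field and so cannot match.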
Index sample docs
{
"title" : "content1 ~"
}
{
"title" : "~ staring with"
}
{
"title" : "in between ~ with"
}
The search query should fetch only the 2nd doc from the sample docs
{
"query": {
"prefix" : { "title.keyword" : "~" }
}
}
And the search result
"hits": [
{
"_index": "pre",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "~ staring with"
}
}
]
Please refer to the prefix query documentation for more info
Update 1:
Index Mapping:
{
"mappings": {
"properties": {
"date": {
"type": "date"
}
}
}
}
Index Data:
{
"date": "2015-02-01",
"title" : "in between ~ with"
}
{
"date": "2015-01-01",
"title": "content1 ~"
}
{
"date": "2015-02-01",
"title" : "~ staring with"
}
{
"date": "2015-02-01",
"title" : "~ in between with"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"prefix": {
"title.keyword": "~"
}
},
{
"range": {
"date": {
"lte": "2015-02-05",
"gte": "2015-01-11"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ staring with"
}
},
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "4",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ in between with"
}
}
]
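Applied to the branch index from the question, the same bool structure might look like the sketch below. It assumes projectKey has the default .keyword sub-field (the highlight section in the question does show projectKey.keyword) and that date is mapped as a date. The parsing error in the question comes from placing prefix and range side by side under query, which accepts only one top-level clause. Here the clauses go under filter instead of must, which is my own choice to skip scoring; must works as well. Send it as the body of GET /branch/_search:

```json
{
  "query": {
    "bool": {
      "filter": [
        {
          "prefix": {
            "projectKey.keyword": "~"
          }
        },
        {
          "range": {
            "date": {
              "gte": "2020-09-14",
              "lte": "2020-09-14"
            }
          }
        }
      ]
    }
  }
}
```

The date bounds are copied from the question as-is; widen them if the range turns out narrower than intended.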
Upvotes: 1
Reputation: 290
If I understood your issue correctly, I suggest creating a custom analyzer to search for the special character ~.
I ran a test locally as follows, replacing ~ with __SPECIAL__:
I created an index with a custom char_filter and added a sub-field to the projectKey field. The name of the new multi-field is special_characters.
Here is the mapping:
PUT wildcard-index
{
"settings": {
"analysis": {
"char_filter": {
"special-characters-replacement": {
"type": "mapping",
"mappings": [
"~ => __SPECIAL__"
]
}
},
"analyzer": {
"special-characters-analyzer": {
"tokenizer": "standard",
"char_filter": [
"special-characters-replacement"
]
}
}
}
},
"mappings": {
"properties": {
"projectKey": {
"type": "text",
"fields": {
"special_characters": {
"type": "text",
"analyzer": "special-characters-analyzer"
}
}
}
}
}
}
Then I ingested the following contents in the index:
"projectKey": "content1 ~"
"projectKey": "This ~ is a content"
"projectKey": "~ cars on the road"
"projectKey": "o ~ngram"
Then, the query was:
GET wildcard-index/_search
{
"query": {
"match": {
"projectKey.special_characters": "~"
}
}
}
The response was:
"hits" : [
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "h1hKmHQBowpsxTkFD9IR",
"_score" : 0.43250346,
"_source" : {
"projectKey" : "content1 ~"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "iFhKmHQBowpsxTkFFNL5",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "This ~ is a content"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "-lhKmHQBowpsxTkFG9Kg",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "~ cars on the road"
}
}
]
Please let me know if you have any issue; I will be glad to help.
Note: This method works only if there is a blank space after the ~. You can see from the response that the 4th document was not returned.
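To check this yourself, the _analyze API shows which tokens the custom analyzer emits for a given value. A minimal request body for GET wildcard-index/_analyze, using the analyzer name from the mapping above and the 4th sample value as the text:

```json
{
  "analyzer": "special-characters-analyzer",
  "text": "o ~ngram"
}
```

If ~ngram comes back as one token that differs from the token produced for a lone ~, the match query has nothing to hit, which would explain the missing 4th document.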
Upvotes: 2