Roman Puchkovskiy

Reputation: 11865

Search by exact match in all fields in Elasticsearch

Let's say I have 3 documents, each of them only contains one field (but let's imagine that there are more, and we need to search through all fields).

  1. Field value is "first second"
  2. Field value is "second first"
  3. Field value is "first second third"

Here is a script that can be used to create these 3 documents:

# drop the index completely, use with care!
curl -iX DELETE 'http://localhost:9200/test'

curl -H 'content-type: application/json' -iX PUT 'http://localhost:9200/test/_doc/one' -d '{"name":"first second"}'
curl -H 'content-type: application/json' -iX PUT 'http://localhost:9200/test/_doc/two' -d '{"name":"second first"}'
curl -H 'content-type: application/json' -iX PUT 'http://localhost:9200/test/_doc/three' -d '{"name":"first second third"}'

I need to find only document 1, the one document in which a field contains exactly the text "first second".

Here is what I tried.

A. Plain search:

curl -H 'Content-Type: application/json' -iX POST 'http://localhost:9200/test/_search' -d '{
  "query": {
    "query_string": {
      "query": "first second"
    }
  }
}'

returns all 3 documents

B. Quoting:

curl -H 'Content-Type: application/json' -iX POST 'http://localhost:9200/test/_search' -d '{
  "query": {
    "query_string": {
      "query": "\"first second\""
    }
  }
}'

gives 2 documents: 1 and 3, because both contain 'first second'.

The answer at https://stackoverflow.com/a/28024714/7637120 suggests analyzing the fields with the 'keyword' analyzer at index time, but I would like to avoid any customizations to the mapping.

Is it possible to avoid them and still only find document 1?

Upvotes: 1

Views: 4569

Answers (2)

Roman Puchkovskiy

Reputation: 11865

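In Elasticsearch 7.1.0, it seems that you can apply the keyword analyzer at query time without creating a special mapping. At least I did not create one, and the following query does what I need. My understanding is that this works because the default dynamic mapping indexes each string as text plus a keyword sub-field (name.keyword), and the whole query string "first second", kept as a single term by the keyword analyzer, can only match that sub-field exactly: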

curl -H 'Content-Type: application/json' -iX POST 'http://localhost:9200/test/_search' -d '{
  "query": {
    "query_string": {
      "query": "first second",
      "analyzer": "keyword"
    }
  }
}'
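
If the index relies on the default dynamic mapping, each string field should also get a keyword sub-field such as name.keyword. Here is a minimal sketch, assuming that default mapping, that targets those sub-fields explicitly instead of overriding the analyzer. The ES_URL variable and the search_exact function are names made up for this example:

```shell
#!/bin/sh
# Assumption: the "test" index was created by dynamic mapping, so every
# string field has a ".keyword" sub-field (for example name.keyword).
ES_URL="${ES_URL:-http://localhost:9200}"

# Request body: match the quoted string against every *.keyword sub-field.
exact_query='{
  "query": {
    "query_string": {
      "query": "\"first second\"",
      "fields": ["*.keyword"]
    }
  }
}'

# Hypothetical helper: send the query to the "test" index.
search_exact() {
  curl -s -H 'Content-Type: application/json' \
    -X POST "$ES_URL/test/_search" -d "$exact_query"
}
```

Calling search_exact against the three test documents should return only document one if the assumption about the mapping holds.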

Upvotes: 1

JBone

Reputation: 1836

Yes, you can do that by mapping the `name` field as type `keyword`. That is the whole solution: a `keyword` field is indexed as a single unanalyzed term, so only a query for the complete value matches.

To demonstrate it, I did the following:

1) created a mapping with type `keyword` for the `name` field
2) indexed the three documents
3) searched with a `match` query

Mappings

PUT so_test16
{
  "mappings": {
    "_doc":{
      "properties":{
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}

Indexing the documents

POST /so_test16/_doc
{
    "id": 1,
    "name": "first second"
}
POST /so_test16/_doc
{
    "id": 2,
    "name": "second first"
}

POST /so_test16/_doc
{
    "id": 3,
    "name": "first second third"
}

The query

GET /so_test16/_search
{
  "query": {
    "match": {"name": "first second"}
  }
}

and the result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "so_test16",
        "_type" : "_doc",
        "_id" : "m1KXx2sB4TH56W1hdTF9",
        "_score" : 0.2876821,
        "_source" : {
          "id" : 1,
          "name" : "first second"
        }
      }
    ]
  }
}
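
For a `keyword` field, a `term` query is a common alternative to `match`, because it compares the stored value with the query text without any analysis. Here is a minimal sketch in curl form, assuming the so_test16 index from above. ES_URL and search_term are names made up for this example:

```shell
#!/bin/sh
# Assumption: Elasticsearch is reachable at ES_URL and "name" in so_test16
# is mapped as a keyword field, as in the mapping above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Request body: exact, unanalyzed comparison against the stored value.
term_query='{
  "query": {
    "term": { "name": "first second" }
  }
}'

# Hypothetical helper: send the term query to so_test16.
search_term() {
  curl -s -H 'Content-Type: application/json' \
    -X POST "$ES_URL/so_test16/_search" -d "$term_query"
}
```

With the three documents above, search_term should return the same single hit as the `match` query.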

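Adding a second solution, for the case where `name` is a `text` field rather than a `keyword` field. The only extra requirement is that `fielddata: true` must be set on the `name` field so that the script filter can read its terms. As far as I know, the script sees the distinct terms of the field, so this approach assumes the value contains no repeated words.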

Mappings

PUT so_test18
{
  "mappings" : {
    "_doc" : {
      "properties" : {
        "id" : {
          "type" : "long"
        },
        "name" : {
          "type" : "text",
          "fielddata": true
        }
      }
    }
  }
}

and the search query

GET /so_test18/_search
{
  "query": {
    "bool": {
      "must": [
        {"match_phrase": {"name": "first second"}}
      ],
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "doc['name'].values.length == 2"
          }
        }
      }
    }
  }
}

and the response

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.3971361,
    "hits" : [
      {
        "_index" : "so_test18",
        "_type" : "_doc",
        "_id" : "o1JryGsB4TH56W1hhzGT",
        "_score" : 0.3971361,
        "_source" : {
          "id" : 1,
          "name" : "first second"
        }
      }
    ]
  }
}
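
The expected term count in the script filter can also be passed as a script parameter instead of being hard-coded. Here is a minimal sketch in curl form, assuming the so_test18 index from above. ES_URL, search_phrase and the term_count parameter are names made up for this example:

```shell
#!/bin/sh
# Assumption: Elasticsearch is reachable at ES_URL and "name" in so_test18
# is a text field with fielddata enabled, as in the mapping above.
ES_URL="${ES_URL:-http://localhost:9200}"

# A quoted here-document keeps the single quotes of the Painless source intact.
phrase_query=$(cat <<'EOF'
{
  "query": {
    "bool": {
      "must": [
        { "match_phrase": { "name": "first second" } }
      ],
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source": "doc['name'].values.length == params.term_count",
            "params": { "term_count": 2 }
          }
        }
      }
    }
  }
}
EOF
)

# Hypothetical helper: send the phrase query to so_test18.
search_phrase() {
  curl -s -H 'Content-Type: application/json' \
    -X POST "$ES_URL/so_test18/_search" -d "$phrase_query"
}
```

Keeping the count in params means the script source stays the same when the number of words in the phrase changes.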

Upvotes: 1
