GDorn

Reputation: 8811

How to get elasticsearch to perform an exact match query?

This is a two-part question.

My documents look like this:

{"url": "https://someurl.com", 
 "content": "searchable content here", 
 "hash": "c54cc9cdd4a79ca10a891b8d1b7783c295455040", 
 "headings": "more searchable content", 
 "title": "Page Title"}

My first question is how to retrieve all documents where 'title' is exactly "No Title". I don't want a document with the title of "This Document Has No Title" to appear.

My second question is how to retrieve all documents where 'url' appears exactly in a long list of urls.

I'm using pyelasticsearch, but a generic answer in curl would also work.

Upvotes: 5

Views: 9257

Answers (3)

waruna k

Reputation: 878

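Try a match_phrase query; it works. Keep in mind that on an analyzed field, match_phrase matches the phrase wherever it occurs in the value, so it is not a strict whole-field match.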

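from elasticsearch import Elasticsearch

# Placeholder connection settings; adjust for your cluster.
host = "localhost"
port = 9200
connection = Elasticsearch([{'host': host, 'port': port}])

# match_phrase requires the analyzed terms to appear adjacent and in order.
elastic_query = {
    "query": {
        "match_phrase": {
            "UserName": "name"
        }
    }
}
# The client accepts a plain dict as the body, so json.dumps is not needed.
result = connection.search(index="test_index", body=elastic_query)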

Upvotes: 8

dadoonet

Reputation: 14492

You have to define a mapping for fields.

If you are looking for exact (case-sensitive) values, you can set the field's index property to not_analyzed.

Something like:

"url" : {"type" : "string", "index" : "not_analyzed"}
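To make this concrete, here is a minimal sketch in Python of what the mapping and the two exact-match queries from the question might look like once 'title' and 'url' are not_analyzed. The helper names and the "page" document type are hypothetical, and the "string" / "not_analyzed" syntax assumes an older Elasticsearch release (newer versions use the "keyword" type instead). The functions only build request bodies; sending them is left to whichever client you use. The mapping has to be in place before the documents are indexed, so existing data would need reindexing.

```python
import json


def exact_match_mapping(doc_type, fields):
    """Build a mapping body that marks each named field as not_analyzed."""
    properties = {
        name: {"type": "string", "index": "not_analyzed"} for name in fields
    }
    return {doc_type: {"properties": properties}}


def exact_title_query(title):
    """Term query: matches only documents whose whole 'title' equals the value."""
    return {"query": {"term": {"title": title}}}


def url_list_query(urls):
    """Terms query: matches documents whose 'url' equals any value in the list."""
    return {"query": {"terms": {"url": list(urls)}}}


# "page" is a hypothetical document type name.
print(json.dumps(exact_match_mapping("page", ["url", "title"]), indent=2))
print(json.dumps(exact_title_query("No Title")))
print(json.dumps(url_list_query(["https://someurl.com"])))
```

With a not_analyzed 'title', the term query for "No Title" should not return a document titled "This Document Has No Title", because the whole value is indexed as a single term.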

Upvotes: 10

Radu Gheorghe

Reputation: 574

If your source is stored (which is the default), you can use a script filter.

It should go something like this:

$ curl -XPUT localhost:9200/index/type/1 -d '{"foo": "bar"}'
$ curl -XPUT localhost:9200/index/type/2 -d '{"foo": "bar baz"}'
$ curl -XPOST 'localhost:9200/index/type/_search?pretty=true' -d '{
"filter": {
    "script": {
        "script": "_source.foo == \"bar\""
    }
}
}'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "index",
      "_type" : "type",
      "_id" : "1",
      "_score" : 1.0, "_source" : {"foo": "bar"}
    } ]
  }
}

EDIT: I think it's worth mentioning that the "not_analyzed" mapping should be the faster approach, since a script filter has to evaluate the source of each candidate document. But if you want both exact and partial matches for this field, I see two options: use scripts, or index the data twice (once analyzed, once not analyzed).
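As a rough sketch of the "index the data twice" option, the mapping below keeps the main field analyzed for partial matches and adds a not_analyzed sub-field for exact ones. It assumes the "fields" mapping parameter available from Elasticsearch 1.0 onward (earlier releases used a separate multi_field type); the sub-field name "raw" and the helper names are hypothetical.

```python
import json


def dual_indexed_field(sub_field="raw"):
    """Mapping for a field indexed twice: analyzed, plus a not_analyzed sub-field."""
    return {
        "type": "string",  # analyzed by default, used for partial matches
        "fields": {
            sub_field: {"type": "string", "index": "not_analyzed"}
        },
    }


def partial_query(field, text):
    """Match query against the analyzed field."""
    return {"query": {"match": {field: text}}}


def exact_query(field, value, sub_field="raw"):
    """Term query against the not_analyzed sub-field, e.g. 'title.raw'."""
    return {"query": {"term": {"%s.%s" % (field, sub_field): value}}}


print(json.dumps({"properties": {"title": dual_indexed_field()}}, indent=2))
print(json.dumps(partial_query("title", "No Title")))
print(json.dumps(exact_query("title", "No Title")))
```

This trades some extra index size for query speed, since neither the partial nor the exact lookup needs a script.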

Upvotes: 3
