Jérémy

Reputation: 405

Elasticsearch query with an array as search input

I'm trying to query some indexed data with an array of strings as search input.

The indexed data looks like this:

{
  "pubMedID": "21528671",
  "title": "Basic fibroblast [...] melanoma cells.",
  "abstract": "Human malignant [...] cell growth."
}

I would like to search within the 'title' and 'abstract' fields for multiple strings. For example:

queryString=['melanoma', 'dysplastic nevus syndrome']

I have already tried the following code:

queryString=['melanoma', 'dysplastic nevus syndrome']

payload={
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": queryString,
            "fields": [
              "title",
              "abstract"
            ]
          }
        }
      ]
    }
  }
}


import json
payload_json = json.dumps(payload)
res = esclient.search(index='medicine', body=payload_json)

But I get the following error when running this:

RequestError: RequestError(400, 'parsing_exception', '[query_string] query does not support [query]')

The query works fine if I pass a single string value. Can someone tell me how to write this kind of query when the input is an array? Thank you in advance!

Upvotes: 0

Views: 1343

Answers (1)

bryan60

Reputation: 29335

EDIT:

I was a bit unfamiliar with the query_string query, but it turns out you can do something like this with it too:

# Join the search terms from the question's queryString list into one expression
qs = ' OR '.join(queryString)

payload={
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": qs,
            "fields": [
              "title",
              "abstract"
            ]
          }
        }
      ]
    }
  }
}

The resulting query behaves much like the multiple-clause version outlined below.

Docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
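
One caveat with joining the raw terms: as far as I understand the query_string syntax, an unquoted multi-word term such as dysplastic nevus syndrome is treated as separate words rather than as a phrase. The sketch below wraps each term in double quotes before joining. The helper name build_query_string is hypothetical (not part of any library), and the escaping only covers backslashes and double quotes.

```python
def build_query_string(terms):
    """Join terms with OR, quoting each one so multi-word terms match as phrases."""
    quoted = []
    for term in terms:
        # Escape backslashes first, then double quotes, so the term stays
        # inside a single quoted phrase in the query_string syntax.
        escaped = term.replace('\\', '\\\\').replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return ' OR '.join(quoted)


queryString = ['melanoma', 'dysplastic nevus syndrome']
qs = build_query_string(queryString)
print(qs)  # "melanoma" OR "dysplastic nevus syndrome"
```

The resulting qs can be used as the "query" value in the payload above.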

ORIGINAL:

This can be achieved with multiple clauses, like so:

queryString=['melanoma', 'dysplastic nevus syndrome']

payload={
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": queryString[0],
            "fields": [
              "title",
              "abstract"
            ]
          }
        },
        {
          "query_string": {
            "query": queryString[1],
            "fields": [
              "title",
              "abstract"
            ]
          }
        }
      ]
    }
  }
}

If you have a variable number of queries, you just need to build your "should" clauses dynamically, like this:

shoulds = []
# One query_string clause per search term in the queryString list
for q in queryString:
  shoulds.append({
    "query_string": {
      "query": q,
      "fields": [
        "title",
        "abstract"
      ]
    }
  })

payload={
  "query": {
    "bool": {
      "should": shoulds
    }
  }
}
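
The same idea can be wrapped in a small function so the payload is built from any list of terms. This is only a sketch: build_payload and its fields parameter are hypothetical names, and the search call in the comment simply mirrors the one from the question.

```python
import json


def build_payload(terms, fields=("title", "abstract")):
    """Build a bool/should query with one query_string clause per term."""
    shoulds = [
        {"query_string": {"query": term, "fields": list(fields)}}
        for term in terms
    ]
    return {"query": {"bool": {"should": shoulds}}}


queryString = ['melanoma', 'dysplastic nevus syndrome']
payload = build_payload(queryString)
payload_json = json.dumps(payload)

# As in the question (esclient is assumed to be an existing client object):
# res = esclient.search(index='medicine', body=payload_json)
```

Because every clause sits under "should", a document matching any one of the terms is returned, and documents matching more of them tend to score higher.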

Upvotes: 1
