rcc

Reputation: 5

Elasticsearch 7: mapping and sorting

I am struggling to get sorting to work in ES 7. I keep receiving the error "Fielddata is disabled on text fields by default. Set fielddata=true on ...". I also created a mapping in which I turn fielddata on, but it makes no difference.

{
    "index": "orderoverview",
    "body": {
        "mappings": {
            "properties": {
                "CUSTREFNR": {
                    "type": "text",
                    "fielddata": true
                },
                "ORDERNR_EVO": {
                    "type": "text",
                    "fielddata": true
                }
            }
        }
    }
}

I must say, the removal of types in ES 7 is not very clear to me. When indexing, I have to provide a type:

'index' => getIndexName(),
'type' => 'Orderoverview',
'body' => []

Here, if I provide my own type, I get the error: Rejecting mapping update to [index] as the final mapping would have more than 1 type: [_doc, Orderoverview]

So I index them with type _doc, but then the searches don't work anymore.

{
    "index": "orderoverview",
    "type": "_doc",
    "body": {
        "from": 0,
        "size": 20,
        "query": {
            "bool": {
                "should": [
                    {
                        "term": {
                            "CUSTNR": "24508"
                        }
                    }
                ]
            }
        }
    }
}

ES is great, but sometimes feels like black magic.

Upvotes: 0

Views: 1048

Answers (1)

Kamal Kunjapur

Reputation: 8840

Whenever you modify the mapping of existing fields, it is advisable to delete the index and re-ingest the documents. This may not be necessary when you only add new fields, as mentioned by @Amit in the comment.
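The delete-and-re-ingest advice can be sketched as an ordered list of REST calls. This is only an illustration in Python: recreate_index_plan is a hypothetical helper that describes the requests without sending anything, the mapping and document in the demo are placeholders, and it assumes you still have the source documents available to re-ingest.

```python
import json


def recreate_index_plan(index, mappings, docs):
    """Return the REST calls, in order, to rebuild an index with a new mapping.

    Hypothetical helper: it only describes the requests, it does not send them.
    """
    plan = [
        ("DELETE", index, None),                 # drop the old index and its mapping
        ("PUT", index, {"mappings": mappings}),  # recreate it with the new mapping
    ]
    # Re-ingest every document through the typeless _doc endpoint.
    for doc_id, source in docs.items():
        plan.append(("PUT", f"{index}/_doc/{doc_id}", source))
    return plan


if __name__ == "__main__":
    # Placeholder mapping and document, for demonstration only.
    mappings = {"properties": {"CUSTREFNR": {"type": "keyword"}}}
    docs = {"1": {"CUSTREFNR": "24508"}}
    for method, path, body in recreate_index_plan("orderoverview", mappings, docs):
        print(method, path, json.dumps(body) if body is not None else "")
```

The order matters: the new mapping has to be in place before the first document is indexed, otherwise dynamic mapping would decide the field types again.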

Another point: it is better to use a keyword field than to set fielddata to true on text fields, as mentioned in this link.

Secondly, in your question you've specified some fields in the mapping (CUSTREFNR, ORDERNR_EVO), but you are querying a different field (CUSTNR). Make sure you check these little things.

Below is the sample mapping, documents, query request and response.

Mapping:

Instead of setting fielddata to true, create a multi-field. You can read about multi-fields in the link mentioned. Below is how your mapping would look:

PUT orderoverview
{
  "mappings": {
    "properties": {
      "CUSTREFNR": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "ORDERNR_EVO": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Notice that for both fields I've created a sibling keyword sub-field.

You can read about text and keyword datatypes in the respective links.

As a rule of thumb, use keyword for aggregations and sorting, and text for typical full-text searching.
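To illustrate the aggregation side, here is a minimal Python sketch that builds the body of a terms aggregation on a keyword sub-field. It assumes the multi-field mapping shown above (so that CUSTREFNR.keyword exists); terms_aggregation and the by_... aggregation name are made up for this example, and the body would be sent to orderoverview/_search.

```python
import json


def terms_aggregation(field, size=10):
    """Build a search body that counts documents per distinct value of a keyword sub-field."""
    return {
        "size": 0,  # return only the aggregation result, no hits
        "aggs": {
            f"by_{field}": {  # the aggregation name is arbitrary
                "terms": {"field": f"{field}.keyword", "size": size}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(terms_aggregation("CUSTREFNR"), indent=2))
```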

Again, when it comes to exact matches, leverage the keyword field with a Term Query.
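A minimal sketch of such an exact-match request body, again in Python and again assuming the multi-field mapping above. exact_match is a hypothetical helper name; the resulting body would go to orderoverview/_search.

```python
import json


def exact_match(field, value):
    """Build a term query against the keyword sub-field, which stores the value unanalyzed."""
    return {"query": {"term": {f"{field}.keyword": value}}}


if __name__ == "__main__":
    print(json.dumps(exact_match("ORDERNR_EVO", "A1B1"), indent=2))
```

Running the term query against the text field itself can miss: with the default standard analyzer the value A1B1 is indexed as the lowercased token a1b1, while a term query does not analyze its input.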

On another note, you should not specify a type in your mapping in ES 7. From this link, we have the following note:

Specifying types in requests is deprecated. For instance, indexing a document no longer requires a document type. The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST {index}/_doc for auto-generated ids. Note that in 7.0, _doc is a permanent part of the path, and represents the endpoint name rather than the document type.

This means the only acceptable type value during ingestion is _doc. With anything else, your ingestion would fail.
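The two endpoint forms from the quoted note can be captured in a small Python sketch. index_request is a hypothetical helper that only returns the HTTP method and path; it does not talk to a cluster.

```python
def index_request(index, doc_id=None):
    """Return (method, path) for indexing a document via the typeless ES 7 endpoints."""
    if doc_id is None:
        return ("POST", f"{index}/_doc")      # auto-generated id
    return ("PUT", f"{index}/_doc/{doc_id}")  # explicit id


if __name__ == "__main__":
    print(index_request("orderoverview", 1))
    print(index_request("orderoverview"))
```

Note that no custom type such as Orderoverview appears anywhere in the path.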

Sample Documents:

POST orderoverview/_doc/1
{
  "CUSTREFNR": "24508",
  "ORDERNR_EVO": "A1B1"
}

POST orderoverview/_doc/2
{
  "CUSTREFNR": "24508",
  "ORDERNR_EVO": "A1B2"
}

POST orderoverview/_doc/3
{
  "CUSTREFNR": "24508",
  "ORDERNR_EVO": "A2B2"
}

POST orderoverview/_doc/4
{
  "CUSTREFNR": "24509",
  "ORDERNR_EVO": "A1B1"
}

Sample Query:

Let's say you have a use case where you want to select all documents having 24508 as CUSTREFNR, but you want to sort them by ORDERNR_EVO in descending order. Below is how your query would look:

POST orderoverview/_search
{
  "query": {
    "match": {
      "CUSTREFNR": "24508"
    }
  },
  "sort": [
    {
      "ORDERNR_EVO.keyword": {
        "order": "desc"
      }
    }
  ]
}

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "orderoverview",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "CUSTREFNR" : "24508",
          "ORDERNR_EVO" : "A2B2"
        },
        "sort" : [
          "A2B2"
        ]
      },
      {
        "_index" : "orderoverview",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "CUSTREFNR" : "24508",
          "ORDERNR_EVO" : "A1B2"
        },
        "sort" : [
          "A1B2"
        ]
      },
      {
        "_index" : "orderoverview",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "CUSTREFNR" : "24508",
          "ORDERNR_EVO" : "A1B1"
        },
        "sort" : [
          "A1B1"
        ]
      }
    ]
  }
}
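As a sanity check on the order of the hits, the same filter and descending sort can be reproduced locally in plain Python over the four sample documents. This does not call Elasticsearch; it only mirrors the logic, on the assumption that for these ASCII values Python's string comparison orders them the same way the keyword field does. filter_and_sort is a hypothetical helper.

```python
# The four sample documents from above, keyed by _id.
docs = {
    "1": {"CUSTREFNR": "24508", "ORDERNR_EVO": "A1B1"},
    "2": {"CUSTREFNR": "24508", "ORDERNR_EVO": "A1B2"},
    "3": {"CUSTREFNR": "24508", "ORDERNR_EVO": "A2B2"},
    "4": {"CUSTREFNR": "24509", "ORDERNR_EVO": "A1B1"},
}


def filter_and_sort(docs, custrefnr):
    """Keep documents with the given CUSTREFNR and order them by ORDERNR_EVO, descending."""
    matching = [
        (doc_id, source)
        for doc_id, source in docs.items()
        if source["CUSTREFNR"] == custrefnr
    ]
    matching.sort(key=lambda item: item[1]["ORDERNR_EVO"], reverse=True)
    return [doc_id for doc_id, _ in matching]


if __name__ == "__main__":
    print(filter_and_sort(docs, "24508"))  # ['3', '2', '1']
```

The ids come out as 3, 2, 1, matching the order of the hits in the response, and document 4 is excluded because its CUSTREFNR is 24509.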

Hope this helps!

Upvotes: 2
