peetonn

Reputation: 3052

How to index a list of objects in Elasticsearch?

The documents I ingest into Elasticsearch look like this:

{
   'id':'514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0',
   'created':'2019-09-06 06:09:33.044433',
   'meta':{
      'userTags':[
         {
            'intensity':'1',
            'sentiment':'0.84',
            'keyword':'train'
         },
         {
            'intensity':'1',
            'sentiment':'-0.76',
            'keyword':'amtrak'
         }
      ]
   }
}

...and they are ingested with Python:

r = requests.put(itemUrl, auth = authObj, json = document, headers = headers)

The idea is that Elasticsearch will treat keyword, intensity and sentiment as fields that can be queried later. However, on the Elasticsearch side this does not seem to happen (I use Kibana as the search UI): instead, I see a single field "meta.userTags" whose value is the whole list of objects.

How can I make Elasticsearch index the elements within a list?

Upvotes: 2

Views: 4920

Answers (2)

zbusia

Reputation: 601

You don't need a special mapping to index a list - every field can contain one or more values of the same type. See array datatype.

A list of objects can be indexed with either the object or the nested datatype. By default, Elasticsearch uses the object datatype. In that case you can query meta.userTags.keyword and/or meta.userTags.sentiment. The result always contains whole documents, and the values are matched independently of which object in the list they came from, i.e. searching for keyword=train and sentiment=-0.76 WILL find the document with id=514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0, even though those two values belong to different tags.
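To make the "matched independently" behaviour concrete, here is a small pure-Python sketch. It does not talk to Elasticsearch; it only imitates, under my own simplifying assumptions, how the object datatype collapses a list of objects into multi-valued dotted fields. The helper names flatten and matches_all are hypothetical, not part of any Elasticsearch client.

```python
from collections import defaultdict


def flatten(doc):
    """Collapse nested dicts and lists into dotted field names, each holding a flat list of values."""
    flat = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for item in value:
                walk(item, path)  # list items share the parent's path
        else:
            flat[path].append(value)

    walk(doc, "")
    return dict(flat)


def matches_all(flat, terms):
    """True if every field holds the wanted value, regardless of which object it came from."""
    return all(value in flat.get(field, []) for field, value in terms.items())


document = {
    "id": "514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0",
    "meta": {
        "userTags": [
            {"intensity": "1", "sentiment": "0.84", "keyword": "train"},
            {"intensity": "1", "sentiment": "-0.76", "keyword": "amtrak"},
        ]
    },
}

flat = flatten(document)
# "train" and "-0.76" come from different tags, yet both conditions are satisfied.
cross_object_hit = matches_all(
    flat,
    {"meta.userTags.keyword": "train", "meta.userTags.sentiment": "-0.76"},
)
print(flat["meta.userTags.keyword"], cross_object_hit)
```

After flattening, the link between a keyword and its own sentiment is gone, which is why the cross-object search above succeeds.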

If this is not what you want, you need to define a nested datatype mapping for the userTags field and use a nested query.
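A rough sketch of what the nested variant could look like, written as Python dicts that you would send as JSON request bodies. The numeric types for intensity and sentiment are my assumption (the question sends them as strings, which Elasticsearch coerces to numbers by default), and the function names are made up for illustration. Where the "properties" block goes in the mapping request depends on your Elasticsearch version (older versions expect it under a type name).

```python
import json

FIELD = "meta.userTags"  # path of the list of objects in the question's document


def nested_mapping_properties():
    """'properties' section that maps userTags as nested instead of the default object."""
    return {
        "meta": {
            "properties": {
                "userTags": {
                    "type": "nested",
                    "properties": {
                        "keyword": {"type": "keyword"},
                        "intensity": {"type": "integer"},  # assumed type
                        "sentiment": {"type": "float"},    # assumed type
                    },
                }
            }
        }
    }


def nested_tag_query(keyword, min_sentiment):
    """Search body: both conditions must hold within the SAME userTags object."""
    return {
        "query": {
            "nested": {
                "path": FIELD,
                "query": {
                    "bool": {
                        "must": [
                            {"term": {f"{FIELD}.keyword": keyword}},
                            {"range": {f"{FIELD}.sentiment": {"gte": min_sentiment}}},
                        ]
                    }
                },
            }
        }
    }


mapping_body = {"properties": nested_mapping_properties()}
query_body = nested_tag_query("train", 0.5)
print(json.dumps(query_body, indent=2))
```

Keep in mind that the type of an already mapped field cannot be changed in place, so switching userTags from object to nested most likely means creating a new index with this mapping and reindexing the documents.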

Upvotes: 1

soumitra goswami

Reputation: 891

I used the document body you provided to create a new index 'testind' with type 'testTyp' using the Postman REST client:

POST http://localhost:9200/testind/testTyp
{
   "id":"514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0",
   "created":"2019-09-06 06:09:33.044433",
   "meta":{
      "userTags":[
         {
            "intensity":"1",
            "sentiment":"0.84",
            "keyword":"train"
         },
         {
            "intensity":"1",
            "sentiment":"-0.76",
            "keyword":"amtrak"
         }
      ]
   }
}

When I queried the index's mapping, this is what I got:

GET http://localhost:9200/testind/testTyp/_mapping
{  
  "testind":{  
    "mappings":{  
      "testTyp":{  
        "properties":{  
          "created":{  
            "type":"text",
            "fields":{  
              "keyword":{
                "type":"keyword",
                "ignore_above":256
              }
            }
          },
          "id":{  
            "type":"text",
            "fields":{  
              "keyword":{  
                "type":"keyword",
                "ignore_above":256
              }
            }
          },
          "meta":{  
            "properties":{  
              "userTags":{  
                "properties":{  
                  "intensity":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  },
                  "keyword":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  },
                  "sentiment":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

As the mapping shows, the fields inside userTags are part of the mapping and can be queried as needed, so I don't see a problem here as long as the field names are not among these reserved words: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/sql-syntax-reserved.html (you might want to avoid the name 'keyword', though: it can get confusing later when writing search queries, because the field name and the type are both 'keyword').

Also note that this mapping was created by dynamic mapping (https://www.elastic.co/guide/en/elasticsearch/reference/6.3/dynamic-field-mapping.html#dynamic-field-mapping), so Elasticsearch determined the data types from the values you provided. This is not always accurate. To avoid that, you can use the PUT _mapping API to define your own mapping for the index, and then prevent new fields within a type from being added to the mapping.
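If you want to check which field paths a mapping response like the one above actually exposes, a small helper along these lines can list them. This is only a sketch: list_fields is a name I made up, and the sample dict is a trimmed copy of the userTags part of the mapping shown above.

```python
def list_fields(properties, prefix=""):
    """Yield (dotted_path, type) for every field and multi-field in a mapping's 'properties'."""
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "type" in spec:
            yield path, spec["type"]
        # multi-fields such as the 'keyword' sub-field are addressed as <path>.<sub-field>
        for sub_name, sub_spec in spec.get("fields", {}).items():
            yield f"{path}.{sub_name}", sub_spec["type"]
        # object fields carry their children under their own 'properties' key
        if "properties" in spec:
            yield from list_fields(spec["properties"], prefix=f"{path}.")


text_with_keyword = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Trimmed copy of the dynamically created mapping for meta.userTags.
sample_properties = {
    "meta": {
        "properties": {
            "userTags": {
                "properties": {
                    "intensity": text_with_keyword,
                    "keyword": text_with_keyword,
                    "sentiment": text_with_keyword,
                }
            }
        }
    }
}

fields = dict(list_fields(sample_properties))
for path, field_type in sorted(fields.items()):
    print(path, field_type)
```

The output includes both meta.userTags.keyword (text) and meta.userTags.keyword.keyword (keyword), which is the naming clash mentioned above.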

Upvotes: 1
