Archimedes Trajano

Reputation: 41300

Does the Elasticsearch ID have to be unique to a type or to the index?

Elasticsearch allows you to store a _type along with the _index. If I provide my own _id, does it have to be unique across the whole index, or only within its type?

Upvotes: 5

Views: 6952

Answers (2)

Slam

Reputation: 8572

The _id has to be unique together with the _type: the same _id can be used under two different types in the same index.

PUT so
PUT /so/t1/1
{}
PUT /so/t2/1
{}
GET /so/_search

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 1,
      "hits": [
         {
            "_index": "so",
            "_type": "t2",
            "_id": "1",
            "_score": 1,
            "_source": {}
         },
         {
            "_index": "so",
            "_type": "t1",
            "_id": "1",
            "_score": 1,
            "_source": {}
         }
      ]
   }
}

The reason: you never fetch a document by ID without also specifying its type, and an index-wide query returns each document together with its type and index.
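The behaviour shown above can be pictured with a toy in-memory model. This is only a sketch, not an Elasticsearch client: the `store` dictionary and the `put` and `search` helpers are hypothetical names made up for illustration, and the only assumption is that a document is addressed by the full (index, type, id) triple.

```python
# Toy in-memory model (NOT an Elasticsearch client): documents are keyed
# by the full (index, type, id) triple, as in the example above.
store = {}


def put(index, doc_type, doc_id, source):
    """Create or overwrite the document at (index, doc_type, doc_id).

    Returns True if a new document was created, False if one was replaced.
    """
    key = (index, doc_type, doc_id)
    created = key not in store
    store[key] = source
    return created


def search(index):
    """Return every document in an index, each carrying its own type and id."""
    return [
        {"_index": i, "_type": t, "_id": d, "_source": s}
        for (i, t, d), s in store.items()
        if i == index
    ]


put("so", "t1", "1", {})
put("so", "t2", "1", {})        # same _id, different type: a second document
put("so", "t1", "1", {"v": 2})  # same type and _id: replaces the first one
hits = search("so")
```

In this model `hits` holds two entries, both with `_id` "1", which mirrors the two hits in the search response above.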

Upvotes: 9

IanGabes

Reputation: 2787

Absolutely, there are a few ways of doing it.

The first is using the PUT API, which allows us to specify an ID for a document. So, for the index index and the type type:

curl -XPUT "http://localhost:9200/index/type/1/" -d'
{
    "test":"test"
}'

Which gives me this document:

{
    "_index": "index",
    "_type": "type",
    "_id": "1",
    "_score": 1,
    "_source": {
        "test": "test"
    }
}

Another way is to route the ID to a unique field in your mapping. For example, an md5 hash. So, for an index called index with a type called type, we can specify the following mapping:

curl -XPUT "http://localhost:9200/index/_mapping/type" -d'
{    
    "type": {
        "_id":{
            "path" : "md5"
        },
        "properties": {
           "md5": {
               "type":"string"
           }
        }
    }
}'

This time, I'm going to use the POST API. If you haven't specified a path in your mapping, POST automatically generates an ID for you; with the path set as above, the ID is taken from the md5 field instead.

curl -XPOST "http://localhost:9200/index/type/" -d'
{
    "md5":"00000000000011111111222222223333"
}'

Which gives me the following document in a search:

{
   "_index": "index",
   "_type": "type",
   "_id": "00000000000011111111222222223333",
   "_score": 1,
   "_source": {
      "md5": "00000000000011111111222222223333"
   }
}
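A client-side variant of the same idea, as a minimal sketch: compute the hash yourself and use it as the ID in the PUT URL from the first method. This may be useful if the mapping-level `_id` path is not an option in your version. The helper names `content_id` and `build_put_request` are hypothetical, and hashing the raw payload bytes is an assumption about where the md5 value comes from; the code only builds the URL and body and does not send anything.

```python
import hashlib
import json


def content_id(payload: bytes) -> str:
    """Return the hex MD5 digest of the payload, used here as the document ID."""
    return hashlib.md5(payload).hexdigest()


def build_put_request(base_url, index, doc_type, payload):
    """Build the URL and JSON body for a PUT that uses the hash as the _id.

    Nothing is sent over the network; the caller decides how to issue the PUT.
    """
    doc_id = content_id(payload)
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps({"md5": doc_id})
    return url, body


# Hypothetical payload; in practice this would be whatever the hash identifies.
url, body = build_put_request("http://localhost:9200", "index", "type", b"hello")
```

The resulting `url` and `body` can be passed to curl -XPUT exactly as in the first example, so the same content always lands on the same ID.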

The second method is generally preferred, because every ID in the index then follows the same scheme. With the first method, any value is a valid ID: it could be 1 like in the example, or dog in another case.

Upvotes: 0
