johnyka

Reputation: 409

Is it a good idea to store an ID in the _type field of an ElasticSearch index?

I just started a home project and I'm planning to use Elasticsearch as the database. I'm currently in the design phase and started thinking about this question.

So let's say I have articles that belong to different people. Both the Person object and the Article object have an ID property. Obviously there will be an index that holds Article documents. It seems like a good idea to use the _type field of these documents to store the ID of the Person the Article belongs to. However, I've never seen anyone use this field for something like this.

Is it faster to search in metadata than in _source data? I mean, if I don't use _type to store an ID, the Article object would have an ownerID field or something like that.

For a concrete example, let's say I want to find all the articles that are about politics and written by XY, in any order.

First version (note that XY is in the URL path):

GET /my_index/XY/_search
{
    "query" : {
        "constant_score" : { 
            "filter" : {
                "term" : { 
                    "genre" : "politics"
                }
            }
        }
    }
}

Second version:

GET /my_index/article/_search
{
   "query" : {
      "constant_score" : { 
         "filter" : {
            "bool" : {
              "must" : [
                 { "term" : {"ownerID" : "XY"}}, 
                 { "term" : {"genre" : "politics"}} 
              ]
           }
         }
      }
   }
}

Is either of them better than the other? I'm optimistic and I want a good design whether 5 people use this site or 5000. Does it matter if I have 5000 different types in an index?

Upvotes: 0

Views: 280

Answers (1)

Val

Reputation: 217314

Yes, it does matter, and that's why the second version is the way to go.

If you decide to use the person ID as the type of your articles and you have 5000 people, then your my_index index will end up with 5000 mapping types in it, all with the same fields. If at some point you wanted to add a new field to your articles, you'd have to modify all 5000 mapping types. That's probably why you've never seen anyone using types like this.
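To make the maintenance cost concrete, here is a rough Python sketch that counts how many mapping updates each design would need when a field is added. It assumes the `/{index}/_mapping/{type}` path layout of the put-mapping endpoint in Elasticsearch versions that support multiple mapping types. The `person_N` type names and the helper `mapping_update_paths` are hypothetical, made up for illustration.

```python
def mapping_update_paths(index, type_names):
    """Build one put-mapping path per mapping type (assumed path layout)."""
    return [f"/{index}/_mapping/{name}" for name in type_names]


# Design 1: one mapping type per person (hypothetical type names).
per_person = mapping_update_paths("my_index", [f"person_{i}" for i in range(5000)])

# Design 2: a single "article" type with an ownerID field.
single = mapping_update_paths("my_index", ["article"])

print(len(per_person), "updates vs", len(single))
```

With per-person types, the number of requests grows with the number of users; with a single type it stays at one.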

It is much more straightforward to have one index and one mapping type for articles and then an ownerID field as in your second version.
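As a minimal sketch of that approach, the search body from the second version can be built from plain Python data, with the owner passed as an ordinary filter value. The helper name `build_article_query` is hypothetical. It also assumes `ownerID` and `genre` are mapped as exact-value (not analyzed) fields, since a `term` filter matches the indexed value exactly.

```python
import json


def build_article_query(owner_id, genre):
    """Return a search body that filters articles by owner and genre."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"ownerID": owner_id}},
                            {"term": {"genre": genre}},
                        ]
                    }
                }
            }
        }
    }


body = build_article_query("XY", "politics")
# Serialize to JSON; this is the payload for GET /my_index/article/_search.
print(json.dumps(body, indent=2))
```

Searching for a different person then only changes a value in the request body, not the URL or the index mappings.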

Upvotes: 2
