timtofan

Reputation: 104

Is having empty fields bad for a Lucene index?

The Elasticsearch documentation on mappings states the following:

Types are not as well suited for entirely different types of data. If your two types have mutually exclusive sets of fields, that means half your index is going to contain "empty" values (the fields will be sparse), which will eventually cause performance problems. In these cases, it’s much better to utilize two independent indices.

I'm wondering how strictly I should take this.

Say I have three types of documents that share the same 60-70% of their fields, with the remaining fields unique to each type.

Should I put each type in a separate index? Or would a single index be fine too, meaning there wouldn't be much wasted storage or any noticeable performance hit on search or index operations?

Basically I'm looking for any information to either confirm or disprove the quote above.

Upvotes: 0

Views: 173

Answers (1)

Persimmonium

Reputation: 15791

If your types overlap by 60-70%, ES will be fine; that does not sound 'mutually exclusive' at all. Note that:

  1. Things will improve in future versions of ES
  2. If you don't need them, you can disable norms and doc_values, as recommended here
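
To illustrate the second point, here is a minimal sketch of an index body that turns off norms and doc_values on the fields that only one document type uses. It only builds and prints the JSON and does not talk to a cluster. The field names (`review_body`, `warehouse_code`) and the helper function are hypothetical, and the layout assumes the typeless mapping syntax of recent ES versions; older versions spell these settings differently, so check the mapping reference for the version you run.

```python
import json


def sparse_field_mapping(text_fields, keyword_fields):
    """Build an index body that disables norms / doc_values on sparse fields.

    Hypothetical helper for illustration only. Assumes the typeless mapping
    layout ("mappings" -> "properties") used by recent ES versions.
    """
    properties = {}
    for name in text_fields:
        # norms only matter for relevance scoring (length normalization)
        properties[name] = {"type": "text", "norms": False}
    for name in keyword_fields:
        # doc_values are only needed for sorting, aggregations and scripts
        properties[name] = {"type": "keyword", "doc_values": False}
    return {"mappings": {"properties": properties}}


# Fields that exist on only one of the document types (made-up names)
mapping = sparse_field_mapping(["review_body"], ["warehouse_code"])
print(json.dumps(mapping, indent=2))
```

Keep in mind the trade-off: a field without norms loses length normalization in scoring, and a field without doc_values can no longer be used for sorting or aggregations, so only do this for fields where you really don't need those features.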

Upvotes: 1
