Leo

Reputation: 1760

What does the "Type" mean in Elasticsearch?

I am totally confused by Elasticsearch's documentation.

In Basic Concepts: Type, a "type" is described as something like a collection in MongoDB:

In this index, you may define a type for user data, another type for blog data, and yet another type for comments data.

But in Types and Mappings: Type Takeaways, it says:

Types are not as well suited for entirely different types of data. If your two types have mutually exclusive sets of fields, that means half your index is going to contain "empty" values (the fields will be sparse), which will eventually cause performance problems.

Don't the "user" and "blog" types mentioned above have mutually exclusive sets of fields? For example, "user" has "name" and "age" fields, while "blog" has "createdAt" and "content".

I have always believed that the mapping between Elasticsearch and MongoDB concepts is:

index <=> database

type <=> collection

Is that right? If not, what is the recommended way to map between them?

Upvotes: 6

Views: 5864

Answers (2)

Andrei Stefan

Reputation: 52368

Types are not as well suited for entirely different types of data. If your two types have mutually exclusive sets of fields, that means half your index is going to contain "empty" values (the fields will be sparse), which will eventually cause performance problems.

At the most basic level, the type is just another field in Elasticsearch. When you run GET /my_index/my_type/_search, ES applies a pre-filter for the value my_type on the field _type. It acts like an automatic filter.
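To make the "automatic filter" idea concrete, here is a toy in-memory sketch in Python. It is not how Elasticsearch is implemented and it does not use any Elasticsearch client; the `index` list, the `search` helper and the sample values are all made up for illustration. It only shows that searching "in a type" behaves like filtering on a `_type` field that every document carries.

```python
# Toy model: one "index" holds every document, and each document
# carries its type in a regular-looking _type field.
# All names and values here are hypothetical sample data.
index = [
    {"_type": "user", "name": "Leo", "age": 30},
    {"_type": "blog", "createdAt": "2016-01-01", "content": "hello"},
    {"_type": "user", "name": "Ann", "age": 25},
]


def search(index, doc_type=None, predicate=lambda doc: True):
    """Return matching documents, pre-filtered by _type when doc_type is given."""
    hits = []
    for doc in index:
        # The "automatic filter": skip documents of any other type.
        if doc_type is not None and doc["_type"] != doc_type:
            continue
        if predicate(doc):
            hits.append(doc)
    return hits


# Searching "in a type" ...
by_type = search(index, doc_type="user")
# ... gives the same hits as an explicit filter on the _type field.
by_filter = search(index, predicate=lambda doc: doc["_type"] == "user")
```

In this model `by_type` and `by_filter` contain the same two user documents, which is the point of the paragraph above: the type adds a filter, not a separate storage area.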

Don't think of indices and types as databases and tables in the SQL world, because that is not what they are.

If you have type1 with fields f1 and f2, and type2 with fields f1 and f3, then the index as a whole contains documents with fields f1, f2 and f3. This matters for scoring: when a query searches for values in field f1, the term frequencies for f1 are global (they cover both type1 and type2). So if you search for some value in f1 within type1, the score you get back is also slightly influenced by the values of f1 in type2.
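A small Python sketch of that scoring effect, under simplifying assumptions: the `idf` function below is a textbook-style inverse document frequency, not Elasticsearch's exact scoring formula, and the documents are invented sample data. It compares the statistic for a term in f1 when counted over type1 alone against the same statistic counted over the whole index.

```python
import math

# Hypothetical sample documents: both types share field f1.
docs = [
    {"_type": "type1", "f1": "apple pie", "f2": "a"},
    {"_type": "type1", "f1": "banana bread", "f2": "b"},
    {"_type": "type1", "f1": "cherry cake", "f2": "c"},
    {"_type": "type2", "f1": "apple tart", "f3": "x"},
    {"_type": "type2", "f1": "apple juice", "f3": "y"},
    {"_type": "type2", "f1": "apple cider", "f3": "z"},
]


def idf(term, field, docs):
    """Simplified inverse document frequency: 1 + ln(N / (df + 1))."""
    n = len(docs)
    # df = number of documents whose field contains the term as a whole word
    df = sum(1 for d in docs if term in d.get(field, "").split())
    return 1 + math.log(n / (df + 1))


type1_only = [d for d in docs if d["_type"] == "type1"]

# What you might expect if each type had its own statistics.
idf_type1 = idf("apple", "f1", type1_only)
# What a shared index gives you: f1 statistics span both types.
idf_global = idf("apple", "f1", docs)
```

Because "apple" is common in type2's f1, it looks less rare across the whole index than within type1 alone, so `idf_global` comes out lower than `idf_type1` and a type1 hit on "apple" would be weighted differently than type1's own data suggests.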

Also, please don't translate a set of SQL tables to ES by simply following the primary key/foreign key approach to define parent/child relationships in ES.

Upvotes: 7

ygh

Reputation: 595

You're right: index == database and type == collection in Elasticsearch. In RDBMS terms, an index is a database and a type can be a table that contains many rows (documents in Elasticsearch).

You could have a different index maintaining user information, with the "name", "age" and other such fields generally attributed to a person, and a different one for blogs with "createdAt", "content", etc. Yet, you might want to have a "user" field inside each blog document to be able to identify the person who posted it. Later, you can apply application-side joins, if need be.
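As a rough illustration of such an application-side join, here is a Python sketch that uses plain dictionaries in place of two separate indices. The `users_index` and `blogs_index` structures, the `blogs_with_authors` helper, the ids and the sample values are all hypothetical; in a real setup each lookup would be a request to Elasticsearch rather than a dictionary access.

```python
# Stand-ins for two separate indices (hypothetical sample data).
users_index = {
    "u1": {"name": "Leo", "age": 30},
    "u2": {"name": "Ann", "age": 25},
}
blogs_index = [
    {"createdAt": "2016-01-01", "content": "first post", "user": "u1"},
    {"createdAt": "2016-01-02", "content": "second post", "user": "u2"},
    {"createdAt": "2016-01-03", "content": "orphan post", "user": "u9"},
]


def blogs_with_authors(blogs, users):
    """Attach the author document to each blog by following its 'user' field."""
    joined = []
    for blog in blogs:
        # The join happens in application code, not in the search engine.
        author = users.get(blog["user"])  # None when the user id is unknown
        joined.append({**blog, "author": author})
    return joined


joined = blogs_with_authors(blogs_index, users_index)
```

Each blog keeps only a reference to its author, and the application resolves that reference after fetching the blogs. A blog that points at a missing user simply gets `None` as its author in this sketch.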

Upvotes: -1
