Reputation: 625
I am using Elasticsearch 5. It sets a limit on the maximum number of fields in an index. The documentation describes the setting as:
The maximum number of fields in an index. The default value is 1000.
I have a _type named 'customer' in the _index 'company', and a customer can have a large number of fields (say 2000) in its doc.
We can meet this requirement by raising the limit in the settings of the company index, as below:
PUT company/_settings
{
  "index.mapping.total_fields.limit": 2000
}
and then put the mapping for customer like this:
PUT /company/_mapping/customer
{
  "customer": {
    "properties": {
      "property1": {
        "type": "text"
      },
      "property2": {
        "type": "text"
      },
      .
      .
      .
      "property2000": {
        "type": "text"
      }
    }
  }
}
The above solution leads to a data sparsity problem, since not every customer has all of the properties.
Alternatively, we can create a separate _type for customer properties (say custom_props) with the following mapping:
PUT /company/_mapping/custom_props
{
  "custom_props": {
    "_parent": {
      "type": "customer"
    },
    "_routing": {
      "required": true
    },
    "properties": {
      "property_name": {
        "type": "text"
      },
      "property_value": {
        "type": "text"
      }
    }
  }
}
Now each property of a customer is stored as a separate doc in custom_props.
When searching for a customer with certain properties, we need a has_child query, and in some use cases a has_child query with inner_hits. According to the ES documentation, these queries are much slower than simple search queries.
So I am looking for the best alternative for handling this problem when an Elasticsearch _index has too many fields.
Upvotes: 1
Views: 2572
Reputation: 6066
There is one way of handling relations in Elasticsearch that you didn't consider: nested objects. They are similar to parent/child but usually have better query performance.
With the nested data type, the mapping might look like this:
PUT /company/
{
  "mappings": {
    "customer": {
      "properties": {
        "commonProperty": {
          "type": "text"
        },
        "customerSpecific": {
          "type": "nested",
          "properties": {
            "property_name": {
              "type": "keyword"
            },
            "property_value": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}
Let's see what a document would look like:
POST /company/customer/1
{
  "commonProperty": "companyID1",
  "customerSpecific": [
    {
      "property_name": "name",
      "property_value": "John Doe"
    },
    {
      "property_name": "address",
      "property_value": "Simple Rd, 112"
    }
  ]
}
POST /company/customer/2
{
  "commonProperty": "companyID1",
  "customerSpecific": [
    {
      "property_name": "name",
      "property_value": "Jane Adams"
    },
    {
      "property_name": "address",
      "property_value": "42 St., 15"
    }
  ]
}
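If your application holds each customer's properties as a flat map, you would need to reshape them into the customerSpecific array before indexing. Here is a minimal Python sketch of that step, assuming the mapping above. The helper names to_nested_props and build_customer_doc are hypothetical, and values are converted to strings only because property_value is mapped as text here:

```python
def to_nested_props(flat_props):
    """Turn {"name": "John Doe"} into a list of name/value objects.

    Properties whose value is None are skipped, so a customer only
    carries the properties it actually has.
    """
    return [
        {"property_name": name, "property_value": str(value)}
        for name, value in flat_props.items()
        if value is not None
    ]


def build_customer_doc(common_property, flat_props):
    """Build a document body matching the nested mapping (hypothetical helper)."""
    return {
        "commonProperty": common_property,
        "customerSpecific": to_nested_props(flat_props),
    }


# Example: produces the same body as the first document above.
doc = build_customer_doc(
    "companyID1",
    {"name": "John Doe", "address": "Simple Rd, 112"},
)
```

The resulting dict can be serialized to JSON and sent with whatever client you already use. Only two fields (property_name and property_value) exist in the mapping no matter how many distinct properties your customers have.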
To query such data we have to use a nested query. For instance, to find a customer whose name matches "John", we might use a query like this:
POST /company/customer/_search
{
  "query": {
    "nested": {
      "path": "customerSpecific",
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "customerSpecific.property_name": "name"
              }
            },
            {
              "match": {
                "customerSpecific.property_value": "John"
              }
            }
          ]
        }
      }
    }
  }
}
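One thing to keep in mind: to require several properties at once, each property needs its own nested clause, because a name/value pair has to match inside the same nested object. Below is a rough Python sketch that builds such a request body as a plain dict. The helper names property_clause and build_customer_query are made up for illustration, and sending the body to Elasticsearch is left to whatever client you use:

```python
def property_clause(name, value, path="customerSpecific"):
    """One nested clause: the same nested object must have this name AND match this value."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"term": {path + ".property_name": name}},
                        {"match": {path + ".property_value": value}},
                    ]
                }
            },
        }
    }


def build_customer_query(conditions):
    """Combine one nested clause per property so that all conditions are required."""
    return {
        "query": {
            "bool": {
                "must": [
                    property_clause(name, value)
                    for name, value in conditions.items()
                ]
            }
        }
    }


# Example: customers whose name matches "John" and whose address matches "Simple".
body = build_customer_query({"name": "John", "address": "Simple"})
```

Putting both property names into a single nested clause would not work, since no single nested object has property_name equal to both "name" and "address".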
Hope that helps!
Upvotes: 2