user3155208

Reputation: 301

Elasticsearch - what to do if fields have the same name but multiple mapping

I use Elasticsearch for storing data sent from multiple sources outside of my system, i.e. I don't control the incoming data: I just receive a JSON document and store it. I have no Logstash with its filters in the middle, only ES and Kibana. Each data source sends its own data type, and all of them are stored in the same index (one per tenant) but in different types. However, since I cannot control the data that is sent to me, it is possible to receive documents of different types where a field has the same name but a different structure.
For example, assume that I have type1 and type2, both with a field FLD that is an object, but the structure of this object is not the same. Specifically, FLD.name is a string field in type1 but an object in type2. In this case, when type1 data arrives it is stored successfully, but when type2 data arrives it is rejected:

failed to put mappings on indices [[myindex]], type [type2]
java.lang.IllegalArgumentException: Mapper for [FLD] conflicts with existing mapping in other types[Can't merge a non object mapping [FLD.name] with an object mapping [FLD.name]]
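
To make it concrete, the two payloads could look something like this (hypothetical values, only the shape of FLD.name differs):

// Hypothetical examples of the incoming documents:
String type1Doc = "{\"FLD\": {\"name\": \"some string\"}}";       // FLD.name is a string
String type2Doc = "{\"FLD\": {\"name\": {\"first\": \"John\"}}}"; // FLD.name is an object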

The ES documentation clearly states that fields with the same name, in the same index, in different mapping types are mapped to the same field internally and must have the same mapping (see here).

My question is: what can I do in this case? I'd prefer to keep all the types in the same index. Is it possible to add a unique-per-type suffix to field names, or something like that? Any other solution? I'm a newbie in Elasticsearch, so maybe I'm missing something simple... Thanks in advance.

Upvotes: 12

Views: 6058

Answers (2)

Chris Wendt

Reputation: 310

There is no way to index arbitrary JSON without pre-processing it before it's indexed - not even Dynamic templates are flexible enough.

You can flatten nested objects into key-value pairs and use the Nested datatype, Multi-fields, and ignore_malformed to index arbitrary JSON (even with type conflicts), as described here. Unfortunately, Elasticsearch can still throw an exception at query time if you try to, for example, match a string against kv_pairs.value.long, so you'll have to choose the appropriate field based on the format of the value. A sketch of the flattening step follows below.
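
As a rough illustration of that flattening step, here is a minimal sketch using Jackson (the helper class name and the key/value field names are mine, and arrays plus the actual ES mapping are omitted for brevity):

import com.fasterxml.jackson.databind.JsonNode;
import java.util.*;

class JsonFlattener { // hypothetical helper, name is mine

    // Recursively turns {"FLD": {"name": {"first": "x"}}} into
    // [{key=FLD.name.first, value=x}], so every document fits one Nested field.
    static void flatten(String prefix, JsonNode node, List<Map<String, Object>> out) {
        if (node.isObject()) {
            node.fields().forEachRemaining(e ->
                    flatten(prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey(),
                            e.getValue(), out));
        } else { // scalar leaf; arrays omitted for brevity
            Map<String, Object> pair = new HashMap<>();
            pair.put("key", prefix);
            pair.put("value", node.isNumber() ? node.numberValue() : node.asText());
            out.add(pair);
        }
    }
}

Flattened this way, both of the conflicting documents above produce the same key/value structure, so FLD.name never has to be mapped as two different things; the type differences move into multi-fields on the value field instead.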

Upvotes: 2

Dherik

Reputation: 19060

It's not the best practice, I suppose, but you can store the field content as a String and do the deserialization manually after retrieving the information.

So, imagine a class like:

class Person {
    private final Object name; // may hold a String, a List, or any other object

    Person(Object name) { this.name = name; }
}

The name field can receive a List of Strings, or a List of any other Object - just as an example.

So, instead of indexing the Person directly, you serialize it to a String and save that content in another class, like:

String personContent = new ObjectMapper().writeValueAsString(person);
RequestDto dto = new RequestDto(personContent);
String dtoContent = new ObjectMapper().writeValueAsString(dto);

And save the dtoContent:

IndexRequest request = new IndexRequest("persons");
request.source(dtoContent, XContentType.JSON);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);

The RequestDto will be a simple class with a String field:

class RequestDto {
    private String content;

    RequestDto(String content) { this.content = content; }
}
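
Reading the data back is then the manual part. A rough sketch (assuming the same high-level REST client and a known document id; I parse into a JsonNode since the shape of name isn't known in advance):

GetRequest getRequest = new GetRequest("persons", "1"); // id assumed known
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);

// Unwrap the stored "content" string, then parse the original Person JSON.
ObjectMapper mapper = new ObjectMapper();
JsonNode wrapper = mapper.readTree(getResponse.getSourceAsString());
JsonNode person = mapper.readTree(wrapper.get("content").asText());
// person.get("name") may now be a string, an object, a list, etc.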

I'm not an expert on Elasticsearch, but you will probably lose a lot of Elasticsearch features by bypassing its validations this way.

Upvotes: 0
