Reputation: 188
I am using the PutElasticsearch5 processor to index documents into ES. My workflow has a couple of other processors before PutElasticsearch5 which convert Avro to JSON.
I get the error below when I run the workflow.
java.lang.IllegalArgumentException: Validation Failed: 1: content type is missing;2: content type is missing;
I couldn't find any other relevant information to troubleshoot this. There is no setting for "Content Type" under the PutElasticsearch5 configuration.
Upvotes: 2
Views: 694
Reputation: 53
I'm also having this issue. As user2297083 said, if you send a batched JSON file into PutElasticsearch5, it throws this exception and routes the file to the failure relationship. The processor seems to handle only one JSON object per FlowFile, and that object cannot be surrounded by array brackets. So if you have a file with content such as:
[{"key":"value"}]
then the processor will fail; however, if you send the same document as:
{"key":"value"}
then the processor will index it successfully, assuming your other configuration is correct.
One workaround, if you don't want to send everything through a splitter before it reaches the PutElasticsearch5 processor, is to attach a splitter processor to PutElasticsearch5's failure relationship and route the split output back into the same PutElasticsearch5. More FlowFiles mean more I/O on your node, though, so I'm actively looking for a way to have the PutElasticsearch5 processor handle a batched JSON document itself. I feel like there has to be a way without writing a custom version of the processor or creating a ton of new FlowFiles.
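For illustration, here is a minimal sketch (plain Python, not NiFi code) of what a splitter such as SplitJson configured with a JsonPath Expression of $[*] effectively does to a batched FlowFile before the pieces go back into PutElasticsearch5; the processor and property names are the standard NiFi ones, but the snippet itself is only a stand-alone illustration:

import json

# One batched FlowFile whose content is a JSON array.
batched = '[{"one":1},{"two":2},{"three":3}]'

# The split produces one FlowFile per array element, each holding a
# single JSON object that PutElasticsearch5 can index on its own.
for doc in json.loads(batched):
    print(json.dumps(doc))

# Output:
# {"one": 1}
# {"two": 2}
# {"three": 3}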
EDIT: Actually, it does answer the question. His question is:
I am using the PutElasticsearch5 processor to index documents into ES. My workflow has a couple of other processors before PutElasticsearch5 which convert Avro to JSON.
I get the error below when I run the workflow.
java.lang.IllegalArgumentException: Validation Failed: 1: content type is missing;2: content type is missing;
which is exactly the exception message that the PutElasticsearch5 processor gives when it is passed a JSON file that is not formatted correctly. His question is why this is happening.
My answer explains why it happens (one possible case) and how to work around it with a solution that does work.
In this case, correctly formatted JSON means a FlowFile that has a single JSON object as its content, as I have shown above.
Looking into this further, though, it makes sense that the processor only accepts a single JSON document per FlowFile, because you can use FlowFile attributes to specify the "id" of the indexed document. If the FlowFile's uuid were used as the id with batched JSON, i.e.
[{"one":1},{"two":2},{"three":3}]
then each JSON object would be indexed in Elasticsearch with the same "index", "type", and "id" (the id being the FlowFile uuid), which is not the desired behavior.
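A quick sketch of why that would be a problem, again plain Python rather than anything NiFi-specific, with a made-up uuid standing in for the FlowFile's uuid: indexing several documents under the same id just overwrites them, so only the last one would survive.

# Stand-in for an Elasticsearch index keyed by document id.
index = {}

flowfile_uuid = "hypothetical-flowfile-uuid"  # made-up id for illustration
batched_docs = [{"one": 1}, {"two": 2}, {"three": 3}]

# Every object gets the same (index, type, id), so each write
# overwrites the previous one.
for doc in batched_docs:
    index[flowfile_uuid] = doc

print(index)  # {'hypothetical-flowfile-uuid': {'three': 3}}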
Upvotes: 1