Reputation: 75
I am new to Elasticsearch and I have a file with 180 fields and 12 million lines. I created an index and type in Elasticsearch and load the data with a Java program, but it takes 1.5 hours. Is there a better way to load the data into Elasticsearch in less time? I tried a MapReduce program, but it sometimes fails, generates duplicate entries, and takes more time than my sequential program.
Can anybody give good suggestions?
Upvotes: 3
Views: 286
Reputation: 1715
A few things to try:

- When using the ES-Hadoop plugin, disable speculative execution to avoid duplicate entries. Speculative execution launches backup copies of slow tasks, and each copy indexes the same documents again, which is the likely source of your duplicates.
- Fine-tune the batch size of the bulk API when using MapReduce to index the data (see the sketch after this list). For more information, refer to https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html and adjust the defaults until you get the best throughput.
- Also try increasing the Elasticsearch heap size.
- If you need to extract content out of the files first, you can use Apache Tika or the Elasticsearch mapper-attachments plugin.
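For reference, here is a minimal sketch of a MapReduce driver with those settings applied, assuming Hadoop 2.x and the ES-Hadoop `EsOutputFormat`; the host, index/type name, and batch values are placeholders you would tune for your cluster:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.elasticsearch.hadoop.mr.EsOutputFormat;

    public class EsBulkIndexJob {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // ES-Hadoop connection settings (placeholder host and index/type)
            conf.set("es.nodes", "localhost:9200");
            conf.set("es.resource", "myindex/mytype");

            // Larger bulk requests per task (defaults are 1mb / 1000 docs);
            // raise these gradually and watch for bulk rejections
            conf.set("es.batch.size.bytes", "10mb");
            conf.set("es.batch.size.entries", "5000");

            // Disable speculative execution so Hadoop never runs duplicate
            // task attempts that would index the same documents twice
            // (on Hadoop 1.x the keys are mapred.map.tasks.speculative.execution
            // and mapred.reduce.tasks.speculative.execution)
            conf.setBoolean("mapreduce.map.speculative", false);
            conf.setBoolean("mapreduce.reduce.speculative", false);

            // Optional: use a natural key from your data as the document id,
            // so re-running a failed job overwrites instead of duplicating
            // conf.set("es.mapping.id", "myIdField");

            Job job = Job.getInstance(conf, "es-bulk-index");
            job.setOutputFormatClass(EsOutputFormat.class);
            // ... set your mapper, input format, and input path here ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Setting es.mapping.id is particularly useful in your case: it makes the writes idempotent, so even if a task is retried, the duplicates disappear.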
Hope it helps!
Upvotes: 0