Reputation: 2209
I'm trying to index some data in ES and I'm receiving an out of memory exception:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.elasticsearch.common.jackson.core.util.BufferRecycler.balloc(BufferRecycler.java:155)
at org.elasticsearch.common.jackson.core.util.BufferRecycler.allocByteBuffer(BufferRecycler.java:96)
at org.elasticsearch.common.jackson.core.util.BufferRecycler.allocByteBuffer(BufferRecycler.java:86)
at org.elasticsearch.common.jackson.core.io.IOContext.allocWriteEncodingBuffer(IOContext.java:152)
at org.elasticsearch.common.jackson.core.json.UTF8JsonGenerator.<init>(UTF8JsonGenerator.java:123)
at org.elasticsearch.common.jackson.core.JsonFactory._createUTF8Generator(JsonFactory.java:1284)
at org.elasticsearch.common.jackson.core.JsonFactory.createGenerator(JsonFactory.java:1016)
at org.elasticsearch.common.xcontent.json.JsonXContent.createGenerator(JsonXContent.java:68)
at org.elasticsearch.common.xcontent.XContentBuilder.<init>(XContentBuilder.java:96)
at org.elasticsearch.common.xcontent.XContentBuilder.builder(XContentBuilder.java:77)
at org.elasticsearch.common.xcontent.json.JsonXContent.contentBuilder(JsonXContent.java:38)
at org.elasticsearch.common.xcontent.XContentFactory.contentBuilder(XContentFactory.java:122)
at org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder(XContentFactory.java:49)
at EsController.importProductEs(EsController.java:60)
at Parser.fromCsvToJson(Parser.java:120)
at CsvToJsonParser.parseProductFeeds(CsvToJsonParser.java:43)
at MainParser.main(MainParser.java:49)
This is how I instantiate the ES client:
System.out.println("Elastic search client is instantiated");
Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "elasticsearch_brew").build();
client = new TransportClient(settings);
String hostname = "localhost";
int port = 9300;
((TransportClient) client).addTransportAddress(new InetSocketTransportAddress(hostname, port));
bulkRequest = client.prepareBulk();
and then I run the bulk request:
// for each product in the list, we need to include its fields in the bulk request
for (HashMap<String, String> productfields : products) {
    try {
        bulkRequest.add(client.prepareIndex(index, type, productfields.get("Product_Id"))
                .setSource(jsonBuilder()
                        .startObject()
                            .field("Name", productfields.get("Name"))
                            .field("Quantity", productfields.get("Quantity"))
                            .field("Make", productfields.get("Make"))
                            .field("Price", productfields.get("Price"))
                        .endObject()
                )
        );
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
//execute the bulk request
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
if (bulkResponse.hasFailures()) {
// process failures by iterating through each bulk response item
}
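For completeness, iterating through the items looks roughly like this with the 1.x client (just a sketch; my actual handling is more involved and the logging here is only illustrative):
for (BulkItemResponse item : bulkResponse.getItems()) {
    if (item.isFailed()) {
        // getFailureMessage() explains why this particular document was rejected
        System.err.println("Failed to index id " + item.getId() + ": " + item.getFailureMessage());
    }
}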
I am trying to index products from various shops, and each shop is a different index. When I reach the 6th shop, which contains around 60,000 products, I get the above exception. I already split the bulk request into chunks of 10,000 to try to avoid the out of memory problem, but I can't figure out where exactly the bottleneck is. Would it help if I somehow flushed the bulk request or restarted the client? I've seen similar posts but none of them works for me.
EDIT
When I instantiate a new client every time I process a new bulk request, I don't get the out of memory exception. But instantiating a new client each time doesn't seem right...
Thank you
Upvotes: 1
Views: 2928
Reputation: 2209
So I figured out what was wrong.
Every new bulk request was being added on top of the previous one in the same builder, so the request kept growing and eventually led to the out of memory error.
So now, before I start a new bulk request, I run
bulkRequest = client.prepareBulk();
which starts a fresh bulk request builder instead of appending to the previous (already executed) one, so it can be garbage collected.
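In other words, each batch now gets its own builder. Roughly the pattern, simplified (the batch size is just what I use, and the IOException handling from my original loop is omitted here):
int batchSize = 10000;
int count = 0;
BulkRequestBuilder bulkRequest = client.prepareBulk();

for (HashMap<String, String> productfields : products) {
    bulkRequest.add(client.prepareIndex(index, type, productfields.get("Product_Id"))
            .setSource(jsonBuilder()
                    .startObject()
                        .field("Name", productfields.get("Name"))
                        .field("Quantity", productfields.get("Quantity"))
                        .field("Make", productfields.get("Make"))
                        .field("Price", productfields.get("Price"))
                    .endObject()));
    count++;

    if (count % batchSize == 0) {
        // send this batch and start a fresh builder, so the old
        // (already executed) request can be garbage collected
        BulkResponse response = bulkRequest.execute().actionGet();
        if (response.hasFailures()) {
            System.err.println(response.buildFailureMessage());
        }
        bulkRequest = client.prepareBulk();
    }
}

// send whatever is left in the last partial batch
if (bulkRequest.numberOfActions() > 0) {
    BulkResponse response = bulkRequest.execute().actionGet();
    if (response.hasFailures()) {
        System.err.println(response.buildFailureMessage());
    }
}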
Thank you guys for your comments
Upvotes: 2