Reputation: 108
I am running the following code, which pulls data from a database and indexes it using Elasticsearch. The data volume is around 1 million records. The code breaks somewhere in the middle with the error "no configured node available". Also, even when the code runs without error, it does not load all of the data.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "elasticsearch")
        .build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
while (rs.next())
{
    Map<String, Object> json = new HashMap<String, Object>();
    json.put("id", rs.getLong("id"));
    json.put("type", rs.getString("type"));
    client.prepareIndex("test", "doc").setSource(json).execute();
}
Thanks in advance for all the help.
Upvotes: 1
Views: 125
Reputation: 52368
Most probably, you are overloading the cluster: the nodes start to run out of memory or CPU and die. Don't send so many, or such large, indexing requests at once. The cluster is clearly not able to support that load and you are reaching its limits. Alternatively, get a more powerful cluster.
Have a look here for details on how to size your chunk of messages:
The entire bulk request needs to be loaded into memory by the node that receives our request, so the bigger the request, the less memory available for other requests. There is an optimal size of bulk request. Above that size, performance no longer improves and may even drop off. The optimal size, however, is not a fixed number. It depends entirely on your hardware, your document size and complexity, and your indexing and search load.
Fortunately, it is easy to find this sweet spot: Try indexing typical documents in batches of increasing size. When performance starts to drop off, your batch size is too big. A good place to start is with batches of 1,000 to 5,000 documents or, if your documents are very large, with even smaller batches.
It is often useful to keep an eye on the physical size of your bulk requests. One thousand 1KB documents is very different from one thousand 1MB documents. A good bulk size to start playing with is around 5-15MB in size.
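As a rough illustration of the advice above, here is a minimal sketch of a batcher that sends documents in groups bounded both by document count and by payload size. The class name `BulkBatcher`, its limits, and the `sender` callback are hypothetical and are not part of the Elasticsearch client; the sketch also assumes Java 8 for `Consumer`. In the question's loop, the callback would presumably build one bulk request (for example with `client.prepareBulk()`), wait for it with `execute().actionGet()`, and check `hasFailures()` on the response before continuing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: groups serialized documents into batches bounded by count and by size. */
final class BulkBatcher {
    private final int maxDocs;                    // e.g. 1,000 to 5,000 as a starting point
    private final long maxBytes;                  // e.g. 5-15 MB as a starting point
    private final Consumer<List<String>> sender;  // assumed to send one bulk request and wait for it
    private final List<String> pending = new ArrayList<>();
    private long pendingBytes = 0;

    BulkBatcher(int maxDocs, long maxBytes, Consumer<List<String>> sender) {
        if (maxDocs <= 0 || maxBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxDocs = maxDocs;
        this.maxBytes = maxBytes;
        this.sender = sender;
    }

    /** Queues one document; sends the batch as soon as either limit is reached. */
    void add(String jsonDoc) {
        pending.add(jsonDoc);
        pendingBytes += jsonDoc.getBytes(StandardCharsets.UTF_8).length;
        if (pending.size() >= maxDocs || pendingBytes >= maxBytes) {
            flush();
        }
    }

    /** Sends whatever is queued; call once after the loop so the last partial batch is not lost. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(pending)); // pass a copy so the sender may keep the list
        pending.clear();
        pendingBytes = 0;
    }
}
```

The two limits are meant to be tuned by experiment: start small, increase the batch size, and stop once throughput no longer improves. Forgetting the final `flush()` after the loop would silently drop the last partial batch.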
Upvotes: 1