Reputation: 119
I am trying to insert a 600MB JSON file (which may grow larger in the future) into Elasticsearch. However, I get the error below:
Error: "toString()" failed
I am using the stream-json npm package, but no luck :( What is the best way to do this? I am thinking of chunking it up, but if there's a better way, that would be great.
var fs = require('fs');
var StreamValues = require('stream-json/streamers/StreamValues');

var makeBulk = function(csList, callback){
    const pipeline = fs.createReadStream('./CombinedServices_IBC.json').pipe(StreamValues.withParser());
    var bulk = [];
    pipeline.on('data', data => {
        for(var index in data.value.features){
            bulk.push(
                { index: {_index: 'combinedservices1', _type: '_doc', _id: data.value.features[index].properties.OBJECTID } },
                {
                    'geometry': data.value.features[index].geometry,
                    'properties': data.value.features[index].properties
                }
            );
        }
    });
    // hand the collected operations back once the whole file has been parsed
    pipeline.on('end', () => callback(bulk));
}
Upvotes: 3
Views: 3171
Reputation: 134
Don't insert a single 600MB bulk: the default bulk queue can hold up to 200 bulk requests in JVM heap space, so if each one is 600MB, what you will get is OOM and GC problems.
Refer to https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html#_how_big_is_too_big ; for example, the Logstash Elasticsearch output plugin sends bulks of up to 20MB.
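Instead of collecting everything into one array, you can flush small bulks while streaming. A minimal sketch, assuming the @elastic/elasticsearch 7.x client and the same stream-json pipeline as in the question (the 1000-document batch size, node URL, and index name are just illustrative):

const fs = require('fs');
const { Client } = require('@elastic/elasticsearch');
const StreamValues = require('stream-json/streamers/StreamValues');

const client = new Client({ node: 'http://localhost:9200' }); // assumed local node
const BATCH_SIZE = 1000; // documents per bulk; keeps each request well under ~20MB

async function indexFile(path) {
    const pipeline = fs.createReadStream(path).pipe(StreamValues.withParser());
    let ops = [];

    // readable streams are async-iterable, so awaiting inside the loop applies backpressure
    for await (const data of pipeline) {
        for (const feature of data.value.features) {
            ops.push(
                { index: { _index: 'combinedservices1', _id: feature.properties.OBJECTID } },
                { geometry: feature.geometry, properties: feature.properties }
            );
            if (ops.length >= BATCH_SIZE * 2) { // two entries per document
                await client.bulk({ body: ops }); // send a small bulk, then start a new one
                ops = [];
            }
        }
    }
    if (ops.length) await client.bulk({ body: ops }); // flush the remainder
}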
Upvotes: 0
Reputation: 1104
There is a tool for exactly this use case: Elasticdump (https://github.com/taskrabbit/elasticsearch-dump).
Installation of elasticsearch-dump
npm install elasticdump -g
elasticdump
Import Json into ES
elasticdump \
--input=./CombinedServices_IBC.json \
--output=http://127.0.0.1:9200/my_index \
  --type=data
Upvotes: 2