Reputation: 382
I have a rather large JSON file containing several million GeoJSON points. I'm able to read it in with JSONParser without a problem. I'm then attempting to process the file with geojson-vt.
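For context, the processing step looks roughly like this. This is only a sketch: the option values are placeholders rather than my exact settings, and in my real code the FeatureCollection is produced by JSONParser instead of being built inline.

    const geojsonvt = require('geojson-vt');

    // Tiny synthetic stand-in for the real multi-gigabyte FeatureCollection,
    // which in my actual code comes out of JSONParser.
    const data = {
      type: 'FeatureCollection',
      features: [{
        type: 'Feature',
        properties: {},
        geometry: { type: 'Point', coordinates: [-122.42, 37.77] }
      }]
    };

    // Building the tile index is where memory usage explodes on the big file.
    const tileIndex = geojsonvt(data, {
      maxZoom: 14,            // placeholder option values
      indexMaxZoom: 5,
      indexMaxPoints: 100000
    });

    console.log(tileIndex.getTile(0, 0, 0)); // z, x, y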
For a 700 MB test file this worked fine once I set the --max_old_space_size V8 parameter to around 8 GB. Now I'm trying to work with a 3 GB file covering a full state and I'm running into issues: no matter how high I set max_old_space_size, the process only grows to around 34 GB before the garbage collector kills it on an allocation failure, even though max_old_space_size is set to 50 GB.
I've read about some other V8 parameters that could potentially help. Here is the last command I attempted to run:
node --max_executable_size=5000 --max_old_space_size=54000 --max-semi-space-size=5000 -nouse-idle-notification --noconcurrent_sweeping app.js
Here is the failure from that command:
<--- Last few GCs --->
[27445:0x2e900d0] 587348 ms: Scavenge 29492.1 (31038.4) -> 29422.8 (31521.9) MB, 2092.6 / 0.0 ms allocation failure
[27445:0x2e900d0] 591039 ms: Scavenge 30244.6 (31803.4) -> 30175.9 (32273.4) MB, 2070.4 / 0.0 ms allocation failure
[27445:0x2e900d0] 594706 ms: Scavenge 30972.9 (32544.4) -> 30904.8 (33028.4) MB, 2060.4 / 0.0 ms allocation failure
[27445:0x2e900d0] 620992 ms: Scavenge 31727.0 (33311.4) -> 31656.7 (2783311.9) MB, 24589.5 / 0.0 ms allocation failure
<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - process out of memory
1: node::Abort() [node]
2: 0x12299bc [node]
3: v8::Utils::ReportOOMFailure(char const*, bool) [node]
4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
5: 0xa6b34b [node]
6: v8::internal::MarkCompactCollector::EvacuateNewSpaceAndCandidates() [node]
7: v8::internal::MarkCompactCollector::CollectGarbage() [node]
8: v8::internal::Heap::MarkCompact() [node]
9: v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
10: v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*, v8::GCCallbackFlags) [node]
11: v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [node]
12: v8::internal::Runtime_AllocateInTargetSpace(int, v8::internal::Object**, v8::internal::Isolate*) [node]
13: 0x1a296258ed46
Aborted
It seems like no matter what I do, it won't grow past this limit. Are there other parameters I can set to let the heap grow further and GC less often?
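For reference, heap growth during the run can be watched with something like this (a minimal sketch using only core APIs; the 10-second interval and the formatting are arbitrary choices):

    // Log heap usage periodically to see where growth stalls.
    // process.memoryUsage() is a core Node.js API.
    const MB = 1024 * 1024;
    setInterval(() => {
      const { rss, heapTotal, heapUsed } = process.memoryUsage();
      console.log(
        'rss=' + Math.round(rss / MB) + 'MB',
        'heapTotal=' + Math.round(heapTotal / MB) + 'MB',
        'heapUsed=' + Math.round(heapUsed / MB) + 'MB'
      );
    }, 10000).unref(); // unref() so this timer doesn't keep the process alive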
I realize this isn't the typical use case for Node.js and it's probably not the best tool for the job, but if I can get geojson-vt to work, there are other libraries that make processing this data set much easier, so I'd like to go this route if possible.
I have essentially unlimited memory available for this task (up to a few TB), so that's not going to be the limiting factor. The data set I'm using is actually a very small part of a larger one that will also need to be processed, so I'll have to scale well beyond where I am now.
Upvotes: 1
Views: 2302
Reputation: 40501
Two ideas:
Try setting only --max_old_space_size, no other flags. (The specific failure you're seeing has to do with new space; I'm not surprised that a semi-space size of several gigabytes is causing issues, as there's no reason to make it that big.)
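To double-check that the flag actually took effect, something like this (using the core v8 module; the 54000 value is taken from your command line, and the file name is made up) will print the limit V8 configured:

    // Print the heap limit V8 actually configured. Run with e.g.
    //   node --max_old_space_size=54000 check-heap.js
    // The output should come out near 54 GB if the flag was applied.
    const v8 = require('v8');
    const limit = v8.getHeapStatistics().heap_size_limit;
    console.log('heap_size_limit:',
      (limit / (1024 * 1024 * 1024)).toFixed(1), 'GB');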
Try using a newer Node.js version (either 8.x, or even the vee-eight-lkgr branch on https://github.com/v8/node for the very latest state of development). There have been some recent fixes to better support super-sized heaps.
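Either way, it's worth confirming which V8 a given node binary actually ships with before re-running; process.versions is a core API:

    // Print the Node.js version and the bundled V8 version.
    console.log('node:', process.version, 'v8:', process.versions.v8);

(Or just run node -p process.versions.v8 from the shell.)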
Upvotes: 2