Dmitry Ermolov

Reputation: 2237

Processing huge json-array files with jq

I have a huge (~7 GB) JSON array of relatively small objects.

Is there a relatively simple way to filter these objects without loading the whole file into memory?

The --stream option looks suitable, but I can't figure out how to fold the stream of [path, value] events back into the original objects.

Upvotes: 18

Views: 15078

Answers (1)

peak

Reputation: 116680

jq 1.5 has a streaming parser. The jq FAQ gives an example of how to convert a top-level array of JSON objects into a stream of its elements:

$ jq -nc --stream 'fromstream(1|truncate_stream(inputs))'
[{"foo":"bar"},{"foo":"baz"}]
{"foo":"bar"}
{"foo":"baz"}

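That may be enough for your purposes, but it is worth noting that setpath/2 can also be helpful. Here's how to produce a stream of "leaflets", that is, one small object per [path, value] event. Because it starts from {}, this form expects the top level of the input to be an object rather than an array: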

jq -c --stream '. as $in | select(length == 2) | {}|setpath($in[0]; $in[1])'
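
As a small worked example of the command above, here it is run against a made-up input whose top level is an object (the file name sample-object.json and its contents are assumptions for illustration). With --stream, jq emits [path, value] events for leaves plus shorter closing events of length 1; select(length == 2) keeps only the leaf events, and setpath rebuilds a one-leaf object from each.

```shell
# Hypothetical sample input with an object at the top level.
printf '%s\n' '{"a":1,"b":{"c":2}}' > sample-object.json

# Keep only the [path, value] events and turn each into a single-leaf object.
leaflets=$(jq -c --stream '. as $in | select(length == 2) | {} | setpath($in[0]; $in[1])' sample-object.json)

echo "$leaflets"
# {"a":1}
# {"b":{"c":2}}
```

Starting from {} works here because every path begins with a string key; for input whose top level is an array the paths begin with a number, so a different starting value would be needed.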

Further information and documentation is available in the jq manual: https://stedolan.github.io/jq/manual/#Streaming

Upvotes: 16
