mike

Reputation: 373

jq convert comma separated JSON objects into array

I have files that each contain N JSON objects separated by commas (,):

{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2} ...

I would like to convert them into one JSON array with N objects using jq

[{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2} ...]

I tried jq -R 'split(",")' myfile.json but it gives me an array of N strings

[
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}" ....
]

Any idea?

Upvotes: 2

Views: 6996

Answers (2)

peak

Reputation: 116690

Since you have millions of these JSON objects, let me first suggest an efficient way to produce a stream of them in the JSON-Lines format (i.e., with "newline" as the delimiter).

WARNING: THE FOLLOWING ASSUMES THAT THE OBJECTS DO NOT CONTAIN JSON STRINGS WITH COMMAS.

Let's assume the comma-separated objects are in a file named objects.txt. First, create a file, program.jq, with the following jq program:

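# Emit each input that parses as JSON; skip any input that raises an error.
# The value 0 serves as an end-of-stream sentinel.
def one:
  (try input catch null)
  | if . == 0 then empty elif . == null then one else (., one) end;

one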

Then, assuming your shell supports ANSI-C quoting ($'...'), the invocation:

 (cat objects.txt; echo 0) |
   sed $'s/,/,\\\n/g' |
   jq -n -c -f program.jq

will produce the stream, one JSON object per line. This is a very manageable format. For example, to produce an array, you could pipe the above-mentioned stream into jq -s .
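As a minimal sketch of that last step, here is how a JSON Lines stream can be collected into one array. The three sample objects are made up for illustration, and the variable names are arbitrary:

```shell
# Hypothetical sample: a small JSON Lines stream, one object per line.
stream=$(printf '%s\n' '{"a":1}' '{"b":2}' '{"c":3}')

# -s (--slurp) gathers all input values into a single array;
# -c prints the result compactly on one line.
array=$(printf '%s\n' "$stream" | jq -c -s .)
echo "$array"   # prints [{"a":1},{"b":2},{"c":3}]
```

Note that -s holds every value in memory at once, so it fits best when the stream is of moderate size.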

However, if the goal is solely to produce a JSON array, then as pointed out elsewhere, the most efficient approach would be to enclose the comma-separated objects in square brackets, along the lines of:

(echo "["; cat objects.txt; echo "]")
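The bracket-wrapping idea could be packaged roughly as follows. This is only a sketch: the function name wrap_in_brackets and the temporary sample file are assumptions made for illustration, not part of the original setup.

```shell
# wrap_in_brackets FILE: print the contents of FILE enclosed in [ and ].
# Nothing is parsed here, so memory use does not grow with the file size.
wrap_in_brackets() {
  printf '['
  cat -- "$1"
  printf ']\n'
}

# Hypothetical sample input written to a temporary file.
objects=$(mktemp)
trap 'rm -f "$objects"' EXIT
printf '%s\n' '{"a":1},{"b":2},{"c":3}' > "$objects"

wrap_in_brackets "$objects"
```

If the file ends with a newline, the closing bracket lands on its own line, which is still valid JSON. Piping the result into jq -c . would validate it and print it compactly.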

So the relevant question here, perhaps, is: what's the real goal? It seems doubtful that an unmanageably large array of small JSON objects is likely to be more useful than either the original comma-separated sequence or a simple stream.

Upvotes: 1

Thor

Reputation: 47099

You are on the right track; you just need to map fromjson over the array, e.g.:

jq -Rc 'split(",") | map(fromjson)' myfile.json

Output:

[{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2}]
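One caveat: split(",") cuts at every comma, so it fails as soon as an object has more than one key or a string contains a comma. A possible variation, sketched here with made-up sample data, is to let jq add the brackets itself and parse the whole text in one go:

```shell
# Hypothetical sample: the first object has two keys and a comma in a string.
input='{"a":1,"note":"x,y"},{"b":2}'

# -R reads raw text and -s slurps it into a single string; the filter then
# wraps that string in brackets and parses it as JSON.
result=$(printf '%s\n' "$input" | jq -Rsc '"[" + . + "]" | fromjson')
echo "$result"   # prints [{"a":1,"note":"x,y"},{"b":2}]
```

This reads the whole file into memory, so it suits inputs of moderate size.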

However, if you are dealing with huge inputs, consider splitting the input into one object per line with a streaming tool such as tr (like split(","), this assumes that no commas occur inside the objects):

<myfile.json tr ',' '\n' | jq -c .

Output:

{"a":1}
{"b":2}
{"c":3}
{"d":2}
{"e":1}
{"f":2}

Upvotes: 1
