Wu Shanhe

Reputation: 95

How to ignore broken JSON line in jq?

When using jq to process log files, some lines may be broken, in which case jq throws an error and stops processing.

For example, the complete log:

{"level":"debug","time":"2021-09-24T19:42:47.140+0800","message":"sent send binary to ws server1","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.305+0800","message":"sent send binary to ws server2","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.469+0800","message":"sent send binary to ws server3","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.499+0800","message":"sent send binary to ws server4","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.581+0800","message":"sent send binary to ws server5","pid":41491,"cid":"32likw","num":1,"count":5120}

jq handles it well:

< snippet1.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server3
sent send binary to ws server4
sent send binary to ws server5

The broken one (the last part of line 3 is missing):

{"level":"debug","time":"2021-09-24T19:42:47.140+0800","message":"sent send binary to ws server1","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.305+0800","message":"sent send binary to ws server2","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.469+0800","message":"sent send binary to ws server3","pi
{"level":"debug","time":"2021-09-24T19:42:47.499+0800","message":"sent send binary to ws server4","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.581+0800","message":"sent send binary to ws server5","pid":41491,"cid":"32likw","num":1,"count":5120}

jq stops at the broken line:

< snippet2.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 4, column 2

I would like jq to ignore the 3rd line and continue, like this:

< snippet2.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5

I tried the -R option mentioned in another post, but it didn't help in this case.

< snippet2.json jq -C -R -r '.message'
jq: error (at <stdin>:1): Cannot index string with string "message"
jq: error (at <stdin>:2): Cannot index string with string "message"
jq: error (at <stdin>:3): Cannot index string with string "message"
jq: error (at <stdin>:4): Cannot index string with string "message"
jq: error (at <stdin>:5): Cannot index string with string "message"

Is there a way to ignore, skip, or suppress errors like this and still get the results for the remaining lines?

Upvotes: 6

Views: 3721

Answers (3)

muelleth

Reputation: 351

I also came across this issue. Briefly, here's our setup:

  • Kubernetes / AKS
  • Datadog for log collection
  • Maven for Java Spring Boot with logstash-logback-encoder dependency

With JSON-encoded log output, kubectl logs becomes pretty unreadable on the command line. However, I think debugging in the terminal still has its advantages. So here's my favorite one-line solution for more or less resilient live tailing from a Kubernetes pod and quick debugging:

kubectl logs -f --tail 200 <podname> | jq -R 'fromjson? | "\(.["@timestamp"]) -- \(.level) -- \(.logger_name) -- \(.message) \(.stack_trace)"' -r

This will ignore/hide lines with parse errors and show some of the JSON fields of each log line using the format string above.

It's not perfect, but it should serve as a starting point. Adapt it to your needs :-)
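
One possible adaptation, as a minimal sketch: fields that are absent from a log line (typically stack_trace) are rendered as the literal text "null" by the format string above. jq's alternative operator `//` can substitute an empty string instead. The two input lines and the logger name below are made-up sample data, not real pod output.

```shell
# Hypothetical input: one non-JSON line and one JSON log line without a stack_trace field.
json_line='{"@timestamp":"2021-09-24T19:42:47.140+0800","level":"DEBUG","logger_name":"demo.Logger","message":"started"}'

# fromjson? drops the non-JSON line; `// ""` replaces a missing stack_trace with an empty string.
formatted=$(printf '%s\n' 'not json at all' "$json_line" |
  jq -Rr 'fromjson? | "\(.["@timestamp"]) -- \(.level) -- \(.logger_name) -- \(.message) \(.stack_trace // "")"')

printf '%s\n' "$formatted"
```

The same `// ""` fallback can be applied to any other optional field in the format string.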

Upvotes: -1

Hyori

Reputation: 306

A few more details on peak's answer. (Credit goes to peak.)

Solution #1:

❯ cat bad.json | jq -r -R 'fromjson? | .message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5

Solution #2:

❯ cat bad.json | jq -r -R '. as $line | try fromjson catch $line | .message'
sent send binary to ws server1
sent send binary to ws server2
jq: error (at <stdin>:3): Cannot index string with string "message"
sent send binary to ws server4
sent send binary to ws server5

jq still prints an error, but it goes to stderr, so you can redirect it:

❯ cat bad.json | jq -r -R '. as $line | try fromjson catch $line | .message' 2>/dev/null
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5

Note that -R and -r can be used together. (Thanks to @peak!)
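
The error in Solution #2 appears because the catch branch hands the broken line back as a plain string, and .message cannot index a string. As a sketch of an alternative to redirecting stderr, jq's builtin `objects` filter can drop anything that is not a JSON object before .message is applied. The file below is a shortened, made-up sample with one truncated line.

```shell
# Hypothetical sample input: the second line is truncated.
log=$(mktemp)
cat > "$log" <<'EOF'
{"message":"sent send binary to ws server1"}
{"message":"sent send binary to ws ser
{"message":"sent send binary to ws server3"}
EOF

# try/catch turns a broken line back into a plain string;
# `objects` then lets only real JSON objects through to .message.
messages=$(jq -Rr '. as $line | try fromjson catch $line | objects | .message' "$log")

printf '%s\n' "$messages"
rm -f "$log"
```

Unlike `2>/dev/null`, this leaves stderr available for any other errors jq might report.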

Upvotes: 4

peak

Reputation: 116670

To skip the broken lines you could use:

jq -Rr 'fromjson? | .message'

If you want to do something else with them, you could start with something like:

jq -R '. as $line | try fromjson catch $line'
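
As one illustration of "something else", the sketch below keeps the messages from good lines and replaces each broken line with a visible marker, so the gaps in the log stay noticeable. The UNPARSEABLE prefix and the sample file are assumptions made for this example only.

```shell
# Hypothetical sample file; the third line is truncated on purpose.
log=$(mktemp)
cat > "$log" <<'EOF'
{"message":"server1"}
{"message":"server2"}
{"message":"server3","pi
{"message":"server4"}
EOF

# Good lines yield their message; lines that fail to parse are echoed with a marker.
result=$(jq -Rr '. as $line | try (fromjson | .message) catch "UNPARSEABLE: \($line)"' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

Here the parentheses matter: they put .message inside the try, so a line that parses but is not an object is also reported rather than raising an error.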

For other options, see:

𝑸: Is there a way to have jq keep going after it hits an error in the input file? Can jq handle broken JSON?

in the jq FAQ.

Upvotes: 8
