dg99

Reputation: 5663

Legal change to JSON input invalidates simple jq

Another department continually updates a JSON file that I then query. Its format is three lists of similar-looking dictionaries:

{
"levels":
[
{"a":1, "b":false, "c":"2012", "d":"2017"}
,{"a":2, "b":true,  "c":"2013", "d":"9999"}
,...
]
,"costs":
[
{"e":12, "f":"foo", "g":"blarg", "h":"2015", "i":"2018"}
,{"e":-3, "f":"foo", "g":"glorb", "h":"2013", "i":"9999"}
,...
]
,"recipes":
[
{"j":"BAZ", "k":["blarg","glorb","bleeg"], "l":"dill", "m":"2016", "n":"2017"}
,{"j":"BAZ", "k":["blarg","bleeg"], "l":"dill", "m":"2017", "n":"9999"}
,...
]
}   # line 3943 (see below)

Recently, my simple jq queries like

jq '.["recipes"][] | select(.l | test("ill"))' < jsonfile

stopped returning all of the results they should (e.g. returning only one of the two "dill" lines above) and started printing this error message:

jq: error (at <stdin>:3943): null (null) cannot be matched, as it is not a string

Line 3943, mentioned in the error, is the final line of the file. Queries against the "levels" and "costs" sections of the file continue to work as normal; only the "recipes" section is breaking, as though jq thinks the closing brace of the file is still part of the "recipes" section.

To me this suggests there's been a formatting change or error in the last section of the file. However, software other than jq (e.g. python) doesn't report any problems parsing it. Before I start going through the input line by line ... does this error message indicate anything obvious to a jq expert?

Alas, I do not keep old versions of the file around for comparison. (I think I will start today.)

Upvotes: 0

Views: 123

Answers (1)

dg99

Reputation: 5663

(self-answering after a bit of investigating)

I think there was no formatting error or change in formatting in the input.

I don't know why my query did not encounter errors previously (maybe I just did not notice), but it turns out that many entries in the "recipes" section do not contain an "l" attribute, and jq stops processing as soon as it encounters one that lacks it.
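For anyone wanting to confirm the same thing on their own file, here is a rough sketch of how the records could be surveyed. The file name sample_types.json and its contents are made up for illustration; only the "recipes" key and the "l" attribute come from my real data. Note that a missing "l" and an explicit "l": null both show up as type "null".

```shell
# Hypothetical sample standing in for the real file.
cat > sample_types.json <<'EOF'
{"recipes": [
  {"j": "BAZ", "l": "dill"},
  {"j": "QUX"},
  {"j": "BAZ", "l": "dill"},
  {"j": "ZOT", "l": null}
]}
EOF

# Tally the JSON type of .l across all recipes.
jq -c '[.recipes[] | .l | type] | group_by(.) | map({type: .[0], count: length})' sample_types.json

# List the records whose .l is anything other than a string.
jq -c '.recipes[] | select((.l | type) != "string")' sample_types.json
```

If the tally shows anything besides "string", a bare test() call on .l will fail on those records.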

I also don't know why jq does not generate the same error message for every record that lacks that attribute, nor why it waits until the final line of the input to generate the single message. (Maybe that behavior is documented somewhere.)
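My best guess, which I have not confirmed against the jq documentation: jq reads the whole file as a single JSON value and runs the filter once against it, so the first uncaught error ends the run for that value. That would explain the single message, and the line number would then be where jq finished reading the value (the end of the file) rather than where the offending record sits. A small reproduction, using a made-up file sample_abort.json, behaves consistently with that:

```shell
# Hypothetical sample: the second record has no "l" attribute.
cat > sample_abort.json <<'EOF'
{"recipes": [
  {"j": "BAZ", "l": "dill"},
  {"j": "QUX"},
  {"j": "BAZ", "l": "dill"}
]}
EOF

# Original-style query: the first record is printed, the second makes
# test() fail on null, and the third is never reached.
jq -c '.recipes[] | select(.l | test("ill"))' sample_abort.json \
  || echo "jq exited with an error"
```

Only the first "dill" record reaches stdout, which matches the "only one of the two lines" symptom from the question.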

In any case, I fixed the error (not just the message, but also the failure to display all relevant records) by testing for the presence of the attribute first:

jq '.["recipes"][] | select(has("l") and (.l | test("ill")))' < jsonfile
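One caveat with has("l"): it only checks that the key exists, so a record with "l": null (or a number) would still trip test(). Two alternatives that look more robust to me are sketched below; the file sample_fix.json and its contents are invented for the example. The first checks the type before matching, relying on "and" short-circuiting; the second uses jq's postfix ? operator to swallow the error for that one record.

```shell
# Hypothetical sample: one record lacks "l", another has "l": null.
cat > sample_fix.json <<'EOF'
{"recipes": [
  {"j": "BAZ", "l": "dill", "m": "2016"},
  {"j": "QUX", "m": "2016"},
  {"j": "BAZ", "l": "dill", "m": "2017"},
  {"j": "ZOT", "l": null, "m": "2018"}
]}
EOF

# Type-check first, so test() only ever sees strings.
jq -c '.recipes[] | select(.l | type == "string" and test("ill"))' sample_fix.json

# Or suppress the per-record error and move on to the next record.
jq -c '.recipes[] | select(.l | test("ill"))?' sample_fix.json
```

Both commands should print the two "dill" records and skip the others without an error. The ? form is shorter, but it also hides any other error raised inside select(), so the explicit type check is probably the safer default.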

Upvotes: -1
