TuxSax
TuxSax

Reputation: 1065

Struggling with parsing JSON with jq

I've read all the posts related to it, I'm playing around with it for hours, and still can't manage to get a grip of this tool which seems to be exactly what I need if I just find a way to make it work as I need... So here's a sample of my JSON:

{
    "res": "0",
    "main": {
        "All": [
      {
        "field1": "a",
        "field2": "aa",
        "field3": "aaa",
        "field4": "0",
        "active": "true",
        "id": "1"
      },
      {
        "field1": "b",
        "field2": "bb",
        "field3": "bbb",
        "field4": "0",
        "active": "false",
        "id": "2"
      },
      {
        "field1": "c",
        "field2": "cc",
        "field3": "ccc",
        "field4": "0",
        "active": "true",
        "id": "3"
      },
      {
        "field1": "d",
        "field2": "dd",
        "field3": "ddd",
        "field4": "0",
        "active": "true",
        "id": "4"
      }
        ]
    }

}

I'd like to selectively extract some of the fields and get a csv output like this:

field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4

Please notice I've skipped some fields and I'm also not interested in the parent arrays and such. Thanks a lot in advance.

Upvotes: 1

Views: 1748

Answers (2)

Mostafa Hussein
Mostafa Hussein

Reputation: 11970

First your JSON needs to be fixed as following:

{
  "main": {

  },
  "table": {
    "All": [
      {
        "field1": "a",
        "field2": "aa",
        "field3": "aaa",
        "field4": "0",
        "active": "true",
        "id": "1"
      },
      {
        "field1": "b",
        "field2": "bb",
        "field3": "bbb",
        "field4": "0",
        "active": "false",
        "id": "2"
      },
      {
        "field1": "c",
        "field2": "cc",
        "field3": "ccc",
        "field4": "0",
        "active": "true",
        "id": "3"
      },
      {
        "field1": "d",
        "field2": "dd",
        "field3": "ddd",
        "field4": "0",
        "active": "true",
        "id": "4"
      }

    ]
  },
  "res": "0"
}

Second using jq you can do the following in order to generate the table output using column:

{ echo Field1 Field2 Field3 ID ; cat data.json  | jq -r '.table.All[] | (.field1, .field2, .field3, .id)' | xargs -L4 } | column -t

Output:

Field1  Field2  Field3  ID
a       aa      aaa     1
b       bb      bbb     2
c       cc      ccc     3
d       dd      ddd     4

Using sed:

echo "field1,field2,field3,id" ;cat data.json  | jq -r '.table.All[] | (.field1, .field2, .field3, .id)' | xargs -L4 | sed 's/ /,/g'

Output:

field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4

Update:

Without using sed or xargs , jq has the ability to format the output as csv like the following:

cat data.json  | jq -r '.table.All[] | [.field1, .field2, .field3, .id] | @csv'

Output:

"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"

Thanks to chepner as he mentioned in comments the header can be added using jq directly as following:

jq -r '(([["field1", "field2", "field3", "id"]]) + [(.table.All[] | [.field1,.field2,.field3,.id])])[]|@csv' data.json 

Output:

"field1","field2","field3","id"
"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"

This command should work correctly according to the last JSON data you have provided in your question:

jq -r '(([["field1", "field2", "field3", "id"]]) + [(.main.All[] | [.field1,.field2,.field3,.id])])[]|@csv' data.json

([["field1", "field2", "field3", "id"]]) : The first part of the command is for the csv header

(.main.All[] | [.field1,.field2,.field3,.id])]) : As main is the parent of your JSON then you can choose it using .main which will print the array All then to print the contents of this array you have to add [] to the name of this array and the full command will be .main.All[] which will print multiple dictionries and we can specific the needed keys by piping the out put of .main.All[] to another array with the keys we want as this [.field1,.field2,.field3,.id]

Upvotes: 3

peak
peak

Reputation: 116957

Here's an only-jq solution that only requires specifying the desired keys once, e.g. on the command line:

jq -r --argjson f '["field1", "field2", "field3", "id"]' '
  $f, (.table.All[] | [getpath( $f[]|[.])]) | @csv'

Output:

"field1","field2","field3","id"
"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"

Losing the quotation marks

One way to avoid quoting the strings would be to pipe into join(",") (or join(", ")) instead of @csv:

field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4

Of course, this might be unacceptable if the values contain commas. In general, if avoiding the quotation marks around strings is important, a good option to consider is @tsv.

Upvotes: 2

Related Questions