Waqar Ahmed
Waqar Ahmed

Reputation: 55

Generate a separate CSV record for each array element

I have a JSON:

{
  "Country": "USA",
  "State": "TX",
  "Employees": [
    {
      "Name": "Name1",
      "address": "SomeAdress1"
    }
  ]
}
{
  "Country": "USA",
  "State": "FL",
  "Employees": [
    {
      "Name": "Name2",
      "address": "SomeAdress2"
    },
    {
      "Name": "Name3",
      "address": "SomeAdress3"
    }
  ]
}
{
  "Country": "USA",
  "State": "CA",
  "Employees": [
    {
      "Name": "Name4",
      "address": "SomeAdress4"
    }
  ]
}

I want to use jq to get the following result in csv format:

Country, State, Name, Address

USA, TX, Name1, SomeAdress1
USA, FL, Name2, SomeAdress2
USA, FL, Name3, SomeAdress3
USA, CA, Name4, SomeAdress4

I have got the following jq:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'

And I get the following with 2nd line having more columns than required. I want these extra columns in a separate row:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

And I want the following result:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

Upvotes: 0

Views: 56

Answers (3)

Weeble
Weeble

Reputation: 17920

The other answers are good, but I want to talk about why your attempt doesn't work, as well as why it seems like it should.

You are wondering why this:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'

produces this:

"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

perhaps because this:

jq '{Country:.Country,State:.State,Name:(.Employees[]|.Name)}'

produces this:

{
  "Country": "USA",
  "State": "TX",
  "Name": "Name1"
}
{
  "Country": "USA",
  "State": "FL",
  "Name": "Name2"
}
{
  "Country": "USA",
  "State": "FL",
  "Name": "Name3"
}
{
  "Country": "USA",
  "State": "CA",
  "Name": "Name4"
}

It turns out the difference is in what exactly [...] and {...} do in a jq filter. In the array constructor [...], the entire contents of the square brackets, commas and all, is a single filter, which is fully evaluated and all the results combined into one array. Each comma inside is simply the sequencing operator, which means generate all the values from the filter on its left, then all the values from the filter on its right. In contrast, the commas in the {...} object constructor are part of the syntax and just separate the fields of the object. If any of the field expressions yield multiple values then multiple whole objects are produced. If multiple field expressions yield multiple value then you get a whole object for every combination of yielded values.

When you do this:

jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'
                  ^      ^                   ^
                  1      2                   3

the problem is that the commas labelled "1", "2" and "3" are all doing the same thing, evaluating all the values for the filter on the left, then all the values for the filter on the right. Then the array constructor catches all of them and produces a single array. The array constructor will never create more than one array for one input.

So with that in mind, you need to make sure that where you're expanding out .Employees[] isn't inside your array constructor. Here's another option to add to the answers you already have:

jq -r '.Employee=.Employees[]|[.Country,.State,.Employee.Name,.Employee.address]|@csv'

demo

or indeed:

jq -r '.Employees[] as $e|[.Country,.State,$e.Name,$e.address]|@csv'

demo

Upvotes: 0

oguz ismail
oguz ismail

Reputation: 50775

You need to generate a separate array for each employee.

[.Country, .State] + (.Employees[] | [.Name, .address]) | @csv

Online demo

Upvotes: 1

Andreas Louv
Andreas Louv

Reputation: 47099

You can store root object in a variable, and then expand the Employees arrays:

$ jq -r '. as $root | .Employees[]|[$root.Country, $root.State, .Name, .address] | @csv'
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"

Upvotes: 1

Related Questions