maloney
maloney

Reputation: 1653

JQ statement to build Json from csv

I have a CSV file that I want to convert to a JSON file with the quotes from the CSV removed using JQ in a shell script.

Here is the CSV named input.csv:

1,"SC1","Leeds"
2,"SC2","Barnsley"

Here is the JQ extract:

jq --slurp --raw-input --raw-output \
   'split("\n") | .[1:] | map(split(",")) |
    map({
         "ListElementCode": .[0],
         "ListElement": "\(.[1]) \(.[2])
      })' \
  input.csv > output.json

this writes to output.json:

[
  {
    "ListElementCode": "1",
    "ListElement": "\"SC1\" \"Leeds\""
  },
  {
    "ListElementCode": "2",
    "ListElement": "\"SC2\" \"Barnsley\""
  }
]

Any idea how I can remove the quotes around the 2 text values that get put into the ListElement part?

Upvotes: 1

Views: 1766

Answers (4)

peak
peak

Reputation: 116710

Here's an uncomplicated and efficient way to solve this particular problem:

jq -n --raw-input --raw-output '
  [inputs
   | split(",")
   | { "ListElementCode": .[0],
       "ListElement": "\(.[1]|fromjson) \(.[2]|fromjson)"
     } ]' input.csv 

Incidentally, there are many robust command-line CSV-to-JSON tools, amongst which I would include:

Upvotes: 1

Gilles Quénot
Gilles Quénot

Reputation: 185025

Using a proper CSV/JSON parser in :

#!/usr/bin/env perl

use strict; use warnings;

use JSON::XS;
use Text::CSV qw/csv/;

# input.csv:
#1,"SC1","Leeds"
#2,"SC2","Barnsley"
my $vars = [csv in => 'input.csv'];
#use Data::Dumper;
#print Dumper $vars; # display the data structure

my $o = [ ];
foreach my $a (@{ $vars->[0] }) {
   push @{ $o }, {
       ListElementCode => $a->[0],
       ListElement     => $a->[1] . " " . $a->[2]
   };
}

my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
print $coder->encode($o);

Output

[
   {
      "ListElement" : "SC1 Leeds",
      "ListElementCode" : "1"
   },
   {
      "ListElement" : "SC2 Barnsley",
      "ListElementCode" : "2"
   }
]

Upvotes: 1

Charles Duffy
Charles Duffy

Reputation: 295353

To solve only the most immediate problem, one could write a function that strips quotes if-and-when they exist:

jq -n --raw-input --raw-output '
    def stripQuotes: capture("^\"(?<content>.*)\"$").content // .;

    [inputs | split(",") | map(stripQuotes) |
     {
         "ListElementCode": .[0],
         "ListElement": "\(.[1]) \(.[2])"
     }]
' <in.csv >out.json

That said, to really handle CSV correctly, you can't just split(","), but need to split only on commas that aren't inside quotes (and need to recognize doubled-up quotes as the escaped form of a single quote). Really, I'd use Python instead of jq for this job -- and of this writing, the jq cookbook agrees that native jq code is only suited for "trivially simple" CSV files.

Upvotes: 1

glenn jackman
glenn jackman

Reputation: 246774

As mentioned, a Ruby answer:

ruby -rjson -rcsv -e '
  data = CSV.foreach(ARGV.shift)
            .map do |row|
              {
                ListElementCode: row.first,
                ListElement: row.drop(1).join(" ")
              }
            end
  puts JSON.pretty_generate(data)
' input.csv
[
  {
    "ListElementCode": "1",
    "ListElement": "SC1 Leeds"
  },
  {
    "ListElementCode": "2",
    "ListElement": "SC2 Barnsley"
  }
]

Upvotes: 1

Related Questions