saleml
saleml

Reputation: 105

Multiple jsons to csv

I have multiple files, each containing multiple highly nested json rows. The two first rows of one such file look like:

{
"u":"28",
"evv":{
       "w":{
            "1":400,
            "2":{
                 "i":[{
                       "l":14,
                       "c":"7",
                       "p":"4"
                       }
                     ]
                 }
           }
       }
}
{
"u":"29",
"evv":{
       "w":{
            "3":400,
            "2":{
                 "i":[{
                       "c":14,
                       "y":"7",
                       "z":"4"
                       }
                     ]
                 }
           }
       }
}

they are actually rows, I just wrote them here this way for more visibility.

My question is the following:

Is there any way to convert all these files to one (or multiple, i.e. one per file) csv/excel... ?

Is there any simple way, that doesn't require writing dozens, or hundreds of lines in Python, specific to my file, to convert all these files to one (or multiple, i.e. one per file) csv/excel... ? One example would be using an external library, script... that handles this particular task, regardless of the names of the fields.

The trap is that some elements do not appear in each line. For example, for the "i" key, we have 3 fields (l, c, p) in the first json, and 3 in the second one (c, y, z). Ideally, the csv should contain as many columns as possible fields (e.g. evv.w.2.i.l, evv.w.2.i.c, evv.w.2.i.p, evv.w.2.i.y, evv.w.2.i.z) at the risk of having (many) null values per csv row.

A possible csv output for this example would have the following columns:

u, evv.w.1, evv.w.3, evv.w.2.i.l, evv.w.2.i.c, evv.w.2.i.p, evv.w.2.i.y, evv.w.2.i.z

Any idea/reference is welcome :)

Thanks

Upvotes: 1

Views: 1047

Answers (2)

Robᵩ
Robᵩ

Reputation: 168776

No, there is no general-purpose program that does precisely what you ask for.

You can, however, write a Python program that does it.

This program might do what you want. It does not have any code specific to your key names, but it is specific to your file format.

  • It can take several files on the command line.
  • Each file is presumed to have one JSON object per line.
  • It flattens the JSON object, joining labels with "."

 

import fileinput
import json
import csv


def flattify(d, key=()):
    if isinstance(d, list):
        result = {}
        for i in d:
            result.update(flattify(i, key))
        return result
    if isinstance(d, dict):
        result = {}
        for k, v in d.items():
            result.update(flattify(v, key + (k,)))
        return result
    return {key: d}

total = []
for line in fileinput.input():
    if(line.strip()):
        line = json.loads(line)
        line = flattify(line)
        line = {'.'.join(k): v for k, v in line.items()}
        total.append(line)

keys = set()
for d in total:
    keys.update(d)

with open('result.csv', 'w') as output_file:
    output_file = csv.DictWriter(output_file, sorted(keys))
    output_file.writeheader()
    output_file.writerows(total)

Upvotes: 1

Niklas Rosencrantz
Niklas Rosencrantz

Reputation: 26642

Please check if this (python3) solution works for you.

import json
import csv

with open('test.json') as data_file:
    with open('output.csv', 'w', newline='') as fp:
        for line in data_file:
            data = json.loads(line)
            output = [[data['u'], data['evv']['w'].get('1'), data['evv']['w'].get('3'),
                       data['evv']['w'].get('2')['i'][0].get('l'), data['evv']['w'].get('2')['i'][0].get('c'),
                       data['evv']['w'].get('2')['i'][0].get('p'), data['evv']['w'].get('2')['i'][0].get('y'),
                       data['evv']['w'].get('2')['i'][0].get('z')]]
            a = csv.writer(fp, delimiter=',')
            a.writerows(output)

test.json

{   "u": "28",  "evv": {        "w": {          "1": 400,           "2": {              "i": [{                 "l": 14,                    "c": "7",                   "p": "4"                }]          }       }   }}
{"u":"29","evv":{       "w":{            "3":400,            "2":{                 "i":[{                       "c":14,                       "y":"7",                       "z":"4"                       }                     ]                 }           }       }}

output

python3 pyprog.py 
dac@dac-Latitude-E7450 ~/P/pyprog> more output.csv 
28,400,,14,7,4,,
29,,400,,14,,7,4

Upvotes: 1

Related Questions