Kallol
Kallol

Reputation: 2189

How to create a json from pandas data frame where columns are the key

I have a data frame df

df:
col1    col2  col3
 1        2     3
 4        5     6
 7        8     9

The json I am looking for is:

 {
            "col1": 1,
            "col1": 4,
            "col1": 7,
        },
        {
            "col2": 2,
            "col2": 5,
            "col2": 8
        },
        {
            "col3": 3,
            "col3": 6,
            "col3": 9,
        }

I have tries df.to_json but its not working

df.to_json(orients=records)
it gives this output
'[{"col1":1,"col2":2,"col3":3},{"col1":4,"col2":5,"col3":6}, 
 {"col1":7,"col2":8,"col3":9}]

This is not the output i was looking for

How to do it in most effective way using pandas/python ?

Upvotes: 0

Views: 249

Answers (3)

Rotem Tal
Rotem Tal

Reputation: 769

JSON files are treated as dicts in python, the JSON file you specified has duplicate keys and could only be parsed as a string (and not using the python json library). The following code:

import json
from io import StringIO

df = pd.DataFrame(np.arange(1,10).reshape((3,3)), columns=['col1','col2','col3'])
io = StringIO()
df.to_json(io, orient='columns')
parsed = json.loads(io.getvalue())
with open("pretty.json", '+w') as of:
    json.dump(parsed, of, indent=4)

will produce the following JSON:

{
    "col1": {
        "0": 1,
        "1": 4,
        "2": 7
    },
    "col2": {
        "0": 2,
        "1": 5,
        "2": 8
    },
    "col3": {
        "0": 3,
        "1": 6,
        "2": 9
    }
}

which you could later load to python. alternatively, this script will produce exatcly the string you want:

with open("exact.json", "w+") as of:
    of.write('[\n\t{\n' + '\t},\n\t{\n'.join(["".join(["\t\t\"%s\": %s,\n"%(c, df[c][i]) for i in df.index]) for c in df.columns])+'\t}\n]')

and the output would be:

[
    {
        "col1": 1,
        "col1": 4,
        "col1": 7,
    },
    {
        "col2": 2,
        "col2": 5,
        "col2": 8,
    },
    {
        "col3": 3,
        "col3": 6,
        "col3": 9,
    }
]

edit: fixed brackets

Upvotes: 1

manojlds
manojlds

Reputation: 301087

You need to do

df.to_json('file.json', orient='records')

Note that this will give you a array of objects:

[
        {
            "col1": 1,
            "col1": 4,
            "col1": 7
        },
        {
            "col2": 2,
            "col2": 5,
            "col2": 8
        },
        {
            "col3": 3,
            "col3": 6,
            "col3": 9
        }
]

You can also do

df.to_json('file.json', orient='records', lines=True)

if you want output like:

{"col1":1,"col1":4,"col1":7},
{"col2":2,"col2":5,"col2":8},
{"col3":3,"col3":6,"col3":9}

To prettify output:

pip install jq
cat file.json | jq '.' > new_file.json

Upvotes: 1

hd1
hd1

Reputation: 34657

This sort of JSON, whilst valid, is not recommended, and strongly so, as during deserialization, you'll lose all but the last element of your JSON array.

Upvotes: 0

Related Questions