Minseven

Reputation: 142

pandas column to list for a json file

From a DataFrame, I want to write a JSON output file in which one key holds a list:

Expected output:

[
  {
    "model": "xx",
    "id": 1,
    "name": "xyz",
    "categories": [1,2],
  },
  {
    ...
  },
]

What I have:

[
  {
    "model": "xx",
    "id": 1,
    "name": "xyz",
    "categories": "1,2",
  },
  {
    ...
  },
]

The current code is:

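import json
import pandas as pd

df = pd.read_excel('data_threated.xlsx')
# to_json returns None when given a path; it writes the file directly
df.reset_index(drop=True).to_json("output_json.json", orient='records')

# Read the file back to inspect the result
with open("output_json.json", 'r') as jsonfile:
    data = json.load(jsonfile)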

How can I achieve this easily?

EDIT:

print(df['categories'].unique().tolist())

['1,2,3', 1, nan, '1,2,3,6', 9, 8, 11, 4, 5, 2, '1,2,3,4,5,6,7,8,9']

Upvotes: 1

Views: 410

Answers (1)

Corralien

Reputation: 120559

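You can split each string value into a list of integers before writing the JSON:

import pandas as pd

df = pd.read_excel('data_threated.xlsx').reset_index(drop=True)
# Split comma-separated strings into lists of ints; any non-string value becomes ''
df['categories'] = df['categories'].apply(lambda x: [int(i) for i in x.split(',')] if isinstance(x, str) else '')
df.to_json('output.json', orient='records', indent=4)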

Content of output.json:

[
    {
        "model":"xx",
        "id":1,
        "name":"xyz",
        "categories":[
            1,
            2
        ]
    }
]

Note you can also use:

df['categories'] = pd.eval(df['categories'])
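The `categories` column shown in the question's EDIT mixes comma-separated strings, plain integers and NaN, so it may help to convert every kind of cell explicitly. Below is a minimal sketch of that idea, not a definitive implementation: `to_int_list` is a hypothetical helper name, the sample DataFrame stands in for the Excel file, and mapping NaN to an empty list is an assumption about the desired output.

```python
import json

import pandas as pd


def to_int_list(value):
    """Convert one `categories` cell to a list of ints (hypothetical helper)."""
    if isinstance(value, str):
        # "1,2,3" -> [1, 2, 3]; empty pieces (e.g. a trailing comma) are skipped
        return [int(part) for part in value.split(",") if part.strip()]
    if pd.isna(value):
        # Assumption: a missing value should become an empty list
        return []
    # A single number such as 9 becomes a one-element list
    return [int(value)]


# Sample data standing in for pd.read_excel('data_threated.xlsx')
df = pd.DataFrame({
    "model": ["xx", "xx", "xx"],
    "id": [1, 2, 3],
    "name": ["xyz", "abc", "def"],
    "categories": ["1,2", 9, float("nan")],
})
df["categories"] = df["categories"].apply(to_int_list)

# Without a path, to_json returns the JSON string, which can be parsed back
records = json.loads(df.to_json(orient="records"))
print(records[0]["categories"])
```

To write the file instead of a string, pass a path as in the answer above, for example `df.to_json('output.json', orient='records', indent=4)`.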

Upvotes: 1
