sooon
sooon

Reputation: 4888

Python Pandas - Convert dataframe into json

I have this pandas.dataframe:

    date.               pid value   interval
0   2021-09-05 00:04:24 1       5.554   2021-09-05 00:00:00
1   2021-09-05 00:06:38 1       4.359   2021-09-05 00:05:00
2   2021-09-05 00:06:46 1       18.364  2021-09-05 00:05:00
3   2021-09-05 00:04:24 2       15.554  2021-09-05 00:00:00
4   2021-09-05 00:06:38 2       3.359   2021-09-05 00:05:00
5   2021-09-05 00:06:46 2       10.364  2021-09-05 00:05:00

which I want to turn it into JSON like this:

{
    "2021-09-05 00:00:00": {
        "pid1": [
            {
                "date": "2021-09-05 00:04:24",
                "pid": 1,
                "value": 5.554,
            },
        ],
        "pid2": [
            {
                "date": "2021-09-05 00:04:24",
                "pid": 2,
                "value": 15.554,
            }
        ],
    },
"2021-09-05 00:05:00": {
        "pid1": [
            {
                "date": "2021-09-05 00:04:24",
                "pid": 1,
                "value": 4.359,
            },
            {
                "date": "2021-09-05 00:04:24",
                "pid": 1,
                "value": 18.364,
            },

        ],
        "pid2": [
            {
                "date": "2021-09-05 00:06:38",
                "pid": 2,
                "value": 3.359,
            },{
                "date": "2021-09-05 00:06:46",
                "pid": 1,
                "value": 10.364,
            },
        ],
    }
}

Basically I want the group the data by the interval value. Is there a quick way to format this?

Upvotes: 3

Views: 84

Answers (1)

jezrael
jezrael

Reputation: 863611

Create helper column with pid, convert to MultiIndex Series and last crate nested dictionary:

s = (df.assign(new = 'pid' + df['pid'].astype(str))
       .groupby(['interval','new'])[['date','pid','value']]
       .apply(lambda x : x.to_dict(orient= 'records')))

d = {level: s.xs(level).to_dict() for level in s.index.levels[0]}

print (d)

{
    '2021-09-05 00:00:00': {
        'pid1': [{
            'date': '2021-09-05 00:04:24',
            'pid': 1,
            'value': 5.554
        }],
        'pid2': [{
            'date': '2021-09-05 00:04:24',
            'pid': 2,
            'value': 15.554
        }]
    },
    '2021-09-05 00:05:00': {
        'pid1': [{
                'date': '2021-09-05 00:06:38',
                'pid': 1,
                'value': 4.359
            },
            {
                'date': '2021-09-05 00:06:46',
                'pid': 1,
                'value': 18.364
            }
        ],
        'pid2': [{
                'date': '2021-09-05 00:06:38',
                'pid': 2,
                'value': 3.359
            },
            {
                'date': '2021-09-05 00:06:46',
                'pid': 2,
                'value': 10.364
            }
        ]
    }
}

Last for json use:

import json
json = json.dumps(d)

Upvotes: 3

Related Questions