Nikita Gupta
Nikita Gupta

Reputation: 513

convert JSON to python dataframe

I have JSON of the form:

{
    "abc":
      {
        "123":[45600,null,3567],
        "378":[78689,2345,5678],
        "343":[23456,null,null]
      }
    }

I have got json data from a url using below way:

json_data = json.loads(url.read().decode())

I need to convert it into Python Pandas Dataframe as below:

ds    y_ds1  y_ds2  y_ds2
123   45600  null   3567
378   78689  2345   5678
343   23456  null   null

I'm trying to do this way :

df = pd.read_json(url,orient='columns')

It gives result in following form:

          abc
123      [45600,null,3567]
378      [78689,2345,5678]
343      [23456,null,null]

Any way by which I can simply split my second column of array into no. of columns present in the array and rename the headings as mentioned above?

Edits: According to the given json, there are 3 elements in all the arrays; what if the array size is 2 or 4 or 5 for all the items. Note: JSON will have all keys with similar sized arrays. But size of array can be anything.

Upvotes: 2

Views: 376

Answers (1)

gold_cy
gold_cy

Reputation: 14216

Does this work for you?

import pandas as pd
import numpy as np

null = np.nan

my_json = {
    "abc":
      {
        "123":[45600,null,3567],
        "378":[78689,2345,5678],
        "343":[23456,null,null]
      }
    }

pd.DataFrame(my_json.get('abc')).T.reset_index().rename(columns={'index':'ds',0:'y_ds1',1:'y_ds2',2:'y_ds3'})

    ds    y_ds1   y_ds2   y_ds2
0  123  45600.0     NaN  3567.0
1  343  23456.0     NaN     NaN
2  378  78689.0  2345.0  5678.0

If the index column can remain the ds column than you can do this:

pd.DataFrame(my_json.get('abc')).T.rename(columns=(lambda x: 'y_ds' + str(x)))

       y_ds0   y_ds1   y_ds2
123  45600.0     NaN  3567.0
343  23456.0     NaN     NaN
378  78689.0  2345.0  5678.0

Edit: Given the DF you presented in your edit you can convert it as so:

temp = df['abc'].apply(lambda x: pd.Series(x)).rename(columns=(lambda x: :'y_ds'+str(x)))

temp

       y_ds0   y_ds1   y_ds2
123  45600.0     NaN  3567.0
378  78689.0  2345.0  5678.0
343  23456.0     NaN     NaN

Upvotes: 2

Related Questions