Reputation: 513
I have JSON of the form:
{
"abc":
{
"123":[45600,null,3567],
"378":[78689,2345,5678],
"343":[23456,null,null]
}
}
I have got json data from a url using below way:
json_data = json.loads(url.read().decode())
I need to convert it into Python Pandas Dataframe as below:
ds y_ds1 y_ds2 y_ds2
123 45600 null 3567
378 78689 2345 5678
343 23456 null null
I'm trying to do this way :
df = pd.read_json(url,orient='columns')
It gives result in following form:
abc
123 [45600,null,3567]
378 [78689,2345,5678]
343 [23456,null,null]
Any way by which I can simply split my second column of array into no. of columns present in the array and rename the headings as mentioned above?
Edits: According to the given json, there are 3 elements in all the arrays; what if the array size is 2 or 4 or 5 for all the items. Note: JSON will have all keys with similar sized arrays. But size of array can be anything.
Upvotes: 2
Views: 376
Reputation: 14216
Does this work for you?
import pandas as pd
import numpy as np
null = np.nan
my_json = {
"abc":
{
"123":[45600,null,3567],
"378":[78689,2345,5678],
"343":[23456,null,null]
}
}
pd.DataFrame(my_json.get('abc')).T.reset_index().rename(columns={'index':'ds',0:'y_ds1',1:'y_ds2',2:'y_ds3'})
ds y_ds1 y_ds2 y_ds2
0 123 45600.0 NaN 3567.0
1 343 23456.0 NaN NaN
2 378 78689.0 2345.0 5678.0
If the index column can remain the ds
column than you can do this:
pd.DataFrame(my_json.get('abc')).T.rename(columns=(lambda x: 'y_ds' + str(x)))
y_ds0 y_ds1 y_ds2
123 45600.0 NaN 3567.0
343 23456.0 NaN NaN
378 78689.0 2345.0 5678.0
Edit: Given the DF
you presented in your edit you can convert it as so:
temp = df['abc'].apply(lambda x: pd.Series(x)).rename(columns=(lambda x: :'y_ds'+str(x)))
temp
y_ds0 y_ds1 y_ds2
123 45600.0 NaN 3567.0
378 78689.0 2345.0 5678.0
343 23456.0 NaN NaN
Upvotes: 2