J_yang

Reputation: 2842

How to parse json file to DataFrame in Python

I have some JSON data from a phone accelerometer, which looks like this:

{u'timestamps': {u'1524771017235': [[u'x',
                                     u'y',
                                     u'z',
                                     u'rotationX',
                                     u'rotationY',
                                     u'rotationZ'],
                                    [-0.02, 0, 0.04, 102.65, 68.15, 108.61],
                                    [-0.03, 0.02, 0.02, 102.63, 68.2, 108.5],
                                    [-0.05, 0.01, 0.1, 102.6, 68.25, 108.4],
                                    [-0.02, 0, 0.09, 102.6, 68.25, 108.4],
                                    [-0.01, 0, 0.03, 102.6, 68.25, 108.4]]}}

What I want is a DataFrame whose columns are named after the data fields (x, y, z, rotationX, rotationY, rotationZ), with one row per data entry. The timestamp information can be stored elsewhere.

When I used d = pd.read_json('data.json'), this is what I get:

timestamps
2018-04-26 19:30:17.235 [[x, y, z, rotationX, rotationY, rotationZ], [...

It seems to take the timestamp as the index and put everything else in a single cell.

I don't have much experience with JSON, so I couldn't really make sense of the pandas.read_json API. Please help.

My current workaround is to manually step through the first two dictionary levels and create a DataFrame that uses the first row as the headers. It works, but it is really not ideal...

dataDf = pd.DataFrame(data = d['timestamps']['1524771017235'][1:], columns = d['timestamps']['1524771017235'][0])

       x     y     z  rotationX  rotationY  rotationZ
0  -0.02  0.00  0.04     102.65      68.15     108.61
1  -0.03  0.02  0.02     102.63      68.20     108.50
2  -0.05  0.01  0.10     102.60      68.25     108.40

Thanks

Upvotes: 1

Views: 88

Answers (1)

Ben.T

Reputation: 29635

You need the key of the inner dictionary {u'1524771017235': [[u'x', ..., which is the value stored under the key timestamps in the dictionary d loaded from your JSON file. On Python 3, keys() returns a view that cannot be indexed directly, so wrap it in list() first. Then try:

list(d['timestamps'].keys())[0]

and it should return your '1524771017235' value, so to create your dataDf without hard-coding the key just do:

# list() so the lookup also works on Python 3
key = list(d['timestamps'].keys())[0]
dataDf = pd.DataFrame(data = d['timestamps'][key][1:],
                      columns = d['timestamps'][key][0])

and you get the same result.
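
For completeness, here is a minimal self-contained sketch of the same idea, assuming the file is read with the standard json module so that d is a plain dict (the sample from the question is inlined as a string in place of data.json). The helper name block_to_frame and the index level names timestamp and row are my own choices for illustration, not part of pandas. It loops over every key under timestamps, so it should also cope with files that hold more than one timestamp:

```python
import json

import pandas as pd

# Inline copy of the sample from the question, standing in for data.json.
raw = '''
{"timestamps": {"1524771017235": [
    ["x", "y", "z", "rotationX", "rotationY", "rotationZ"],
    [-0.02, 0, 0.04, 102.65, 68.15, 108.61],
    [-0.03, 0.02, 0.02, 102.63, 68.2, 108.5],
    [-0.05, 0.01, 0.1, 102.6, 68.25, 108.4],
    [-0.02, 0, 0.09, 102.6, 68.25, 108.4],
    [-0.01, 0, 0.03, 102.6, 68.25, 108.4]
]}}
'''
d = json.loads(raw)  # with a real file: use json.load on the opened file


def block_to_frame(block):
    """Hypothetical helper: the first inner list is the header, the rest are rows."""
    header, rows = block[0], block[1:]
    return pd.DataFrame(rows, columns=header)


# One DataFrame per timestamp key, then stack them; the timestamp string
# becomes the outer index level, so it is kept but stays out of the columns.
frames = {ts: block_to_frame(block) for ts, block in d['timestamps'].items()}
dataDf = pd.concat(frames, names=['timestamp', 'row'])
```

dataDf.loc['1524771017235'] then gives the rows for that one timestamp, and dataDf.reset_index(drop=True) drops the timestamp level entirely if you prefer to store it elsewhere.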

Upvotes: 1
