Rearranging a DataFrame

Question

I want to plot and compare sea levels from a large text file that contains readings from 3 types of sensors (enc, rad, prs). The sensors measured at close but different and irregular timestamps. The raw text file looks like this:

sensorid    sensor  stime   slevel
100 rad 2007-11-24 18:47:00 8.0710001
100 rad 2007-11-24 18:49:00 8.066
100 rad 2007-11-24 18:51:00 8.0600004
100 rad 2007-11-24 18:53:00 8.0500002
101 enc 2007-11-24 18:55:00 6.0079999
100 rad 2007-11-24 18:55:00 8.0389996
99  prs 2007-11-24 18:56:00 7.7600002
99  prs 2007-11-24 18:57:00 7.7529998

I can easily read this file into a DataFrame df which has the time as its index.

df=pd.read_table('data.txt',index_col='stime',parse_dates=['stime'])

                     sensorid sensor    slevel
stime                                         
2007-11-24 18:47:00       100    rad  8.071000
2007-11-24 18:49:00       100    rad  8.066000
2007-11-24 18:51:00       100    rad  8.060000
2007-11-24 18:53:00       100    rad  8.050000
2007-11-24 18:55:00       101    enc  6.008000
2007-11-24 18:55:00       100    rad  8.039000
2007-11-24 18:56:00        99    prs  7.760000
2007-11-24 18:57:00        99    prs  7.753000
2007-11-24 18:40:00       101    enc  6.066000

But I want a DataFrame with "enc", "rad" and "prs" as the columns (and the sensorid), and the sea levels as the values, with NaN where a sensor didn't measure at that timestamp.

I'm looking for a clever trick to do this rearranging in a single line ... a good test for one-liners.

Sobremesa · Accepted Answer

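This is a case of pivoting the df. Note that the column is named 'sensor', and that stime is already the index in your df, so it has to be turned back into a regular column before it can be passed to pivot by name. Try something like this:

df.reset_index().pivot(index='stime', columns='sensor', values='slevel')

For further info, see: http://pandas.pydata.org/pandas-docs/stable/reshaping.html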
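As a rough, self-contained sketch of the pivot on the sample rows from the question: the comma-separated string below is only a stand-in for the real tab-separated data.txt (an assumption made so the snippet runs on its own), and the name wide is just illustrative. Here stime is read as an ordinary column, so no reset_index() is needed.

```python
import io

import pandas as pd

# Sample rows copied from the question. The real file is tab-separated;
# this inline comma-separated string is an assumption so the example runs alone.
raw = """sensorid,sensor,stime,slevel
100,rad,2007-11-24 18:47:00,8.0710001
100,rad,2007-11-24 18:49:00,8.066
100,rad,2007-11-24 18:51:00,8.0600004
100,rad,2007-11-24 18:53:00,8.0500002
101,enc,2007-11-24 18:55:00,6.0079999
100,rad,2007-11-24 18:55:00,8.0389996
99,prs,2007-11-24 18:56:00,7.7600002
99,prs,2007-11-24 18:57:00,7.7529998
"""

# stime stays a regular column here, so it can be named directly in pivot().
df = pd.read_csv(io.StringIO(raw), parse_dates=["stime"])

# One row per timestamp, one column per sensor type; cells with no
# measurement for that (timestamp, sensor) pair come out as NaN.
wide = df.pivot(index="stime", columns="sensor", values="slevel")
print(wide)
```

One caveat: pivot raises a ValueError if the same (stime, sensor) pair occurs more than once. If the full file has such duplicates, pivot_table with an explicit aggfunc (for example 'mean') may be the safer choice.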
