Reputation: 137
I have a Panadas DataFrame containing the measurements (Val) of various operations (OpID) with the time-stamp for when the measurement occurred (OpTime).
OpID OpTime Val
143 2014-01-01 02:35:02 20
143 2014-01-01 02:40:01 24
143 2014-01-01 02:40:03 0
143 2014-01-01 02:45:01 0
143 2014-01-01 02:50:01 20
143 2014-01-01 02:55:01 0
143 2014-01-01 03:00:01 20
143 2014-01-01 03:05:01 24
143 2014-01-01 03:10:01 20
212 2014-01-01 02:15:01 20
212 2014-01-01 02:17:02 0
212 2014-01-01 02:20:01 0
212 2014-01-01 02:25:01 0
212 2014-01-01 02:30:01 20
299 2014-01-01 03:30:03 33
299 2014-01-01 03:35:02 33
299 2014-01-01 03:40:01 34
299 2014-01-01 03:45:01 33
299 2014-01-01 03:45:02 34
My desire to to generate an output which only shows the time periods in which the measurement returned zero. The Start column would show the first in a series of OpTimes which resulted in zero and End would contain the OpTime of the first Val that was non-zero.
Given the above sample data my desired result is:
OpID Start End
143 2014-01-01 02:40:03 2014-01-01 02:50:01
143 2014-01-01 02:55:01 2014-01-01 03:00:01
212 2014-01-01 02:17:02 2014-01-01 02:30:01
Upvotes: 0
Views: 511
Reputation: 54330
How about using pivot()
?
import numpy as np
import pandas as pd
df['Zeros'] = (df['Val']==0)
df['Valdf'] = np.hstack((nan, diff(df['Zeros'].values))) #how do you treat the first cell?
df['Valdr'] = np.hstack((diff(df['Zeros'].values), nan)) #how do you treat the last cell?
pf2 = pd.concat([df[((df['Zeros']!=True)&(df['Valdf']==1))],df[((df['Zeros'])&(df['Valdr']==1))]]).sort_index()
pf2['State'] = np.where(pf2['Zeros'], 'Start', 'End')
pf2['idx2'] = pf2['OpID'] + np.arange(len(pf2))/2*0.00000001 #need unique index for .pivot()
print pf2.pivot(index='idx2', columns='State', values='OpTime')
State End Start
idx2
143.0 2014-01-01 02:50:01 2014-01-01 02:45:01
143.1 2014-01-01 03:00:01 2014-01-01 02:55:01
212.2 2014-01-01 02:30:01 2014-01-01 02:25:01
To get the one you presented, probably just:
pf3 = pd.DataFrame(pf2.pivot(index='idx2', columns='State', values='OpTime'))
pf3.index=asarray(pf3.index.values, 'int')
Upvotes: 1