Reputation: 23
I have this data frame:
date_time value
1/10/2016 0:00:00 28.4
1/10/2016 0:05:00 28.4
1/10/2016 0:10:00 28.4
1/11/2016 0:00:00 27.4
1/11/2016 0:05:00 27.4
1/11/2016 0:10:00 27.4
I want to calculate the difference between two rows in the same timestamp everyday, then add new calculated data to a new column like this:
date_time value change24h
1/10/2016 0:00:00 28.4 NaN
1/10/2016 0:05:00 28.4 NaN
1/10/2016 0:10:00 28.4 NaN
1/11/2016 0:00:00 27.4 -1
1/11/2016 0:05:00 29.4 +1
1/11/2016 0:10:00 28.4 0
I don't want to add new column with last day data then calculate the change. Is there other way for this? Thanks in advance!
Upvotes: 1
Views: 154
Reputation: 862511
You can use:
#convert column to datetime and create index
df = df.set_index(pd.to_datetime(df['date_time']))
#shift by 1 day and reindex for same indices, subtract from right by rsub
df['change24h'] = df['value'].shift(freq='1D').reindex(df.index).rsub(df['value'])
#same as
#df['change24h'] = df['value'].sub(df['value'].shift(freq='1D').reindex(df.index))
#remove datetime index
df = df.reset_index(drop=True)
print (df)
date_time value change24h
0 1/10/2016 0:00:00 28.4 NaN
1 1/10/2016 0:05:00 28.4 NaN
2 1/10/2016 0:10:00 28.4 NaN
3 1/11/2016 0:00:00 27.4 -1.0
4 1/11/2016 0:05:00 29.4 1.0
5 1/11/2016 0:10:00 28.4 0.0
Upvotes: 1