Difference of value between two different times at the same date

Question

I have a dataframe df as below:

Datetime                Value
2020-03-01 08:00:00      10
2020-03-01 10:00:00      12
2020-03-01 12:00:00      15
2020-03-02 09:00:00       1
2020-03-02 10:00:00       3
2020-03-02 13:00:00       8
2020-03-03 10:00:00      20
2020-03-03 12:00:00      25
2020-03-03 14:00:00      15

I would like to calculate the difference between the value on the first time of each date and the last time of each date (ignoring the value of other time within a date), so the result will be:

Datetime      Value_Difference
2020-03-01          5
2020-03-02          7
2020-03-03         -5

I have been doing this using a for loop, but it is slow (as expected) when I have larger data. Any help will be appreciated.

Shaido · Accepted Answer

One solution would be to make sure the data is sorted by time, group by the data and then take the first and last value in each day. This works since pandas will preserve the order during groupby, see e.g. here.

df = df.sort_values(by='Datetime').groupby(df['Datetime'].dt.date).agg({'Value': ['first', 'last']})
df['Value_Difference'] = df['Value']['last'] - df['Value']['first']
df = df.drop('Value', axis=1).reset_index()

Result:

Datetime      Value_Difference
2020-03-01          5
2020-03-02          7
2020-03-03         -5

Difference of value between two different times at the same date

Answers (2)

Related Questions