Resample time series data multiple variables

Question

I have some time series data (made up for this example) with two variables: Value and Temperature.

import numpy as np
import pandas as pd
np.random.seed(11)

rows,cols = 50000,2
data = np.random.rand(rows,cols) 
tidx = pd.date_range('2019-01-01', periods=rows, freq='min')  # 'min' = one-minute frequency (the 'T' alias is deprecated in recent pandas)
df = pd.DataFrame(data, columns=['Temperature','Value'], index=tidx)

How do I resample the data per day into a separate pandas DataFrame named daily_summary, with 3 columns containing:

the daily maximum value
the hour the maximum value occurred
the recorded temperature when the maximum value occurred

I know I can use this code below to find daily maximum value and the hour it occurred:

daily_summary = df.groupby(df.index.normalize())['Value'].agg(['idxmax', 'max']) 
daily_summary['hour'] = daily_summary['idxmax'].dt.hour
daily_summary = daily_summary.drop(['idxmax'], axis=1)
daily_summary.rename(columns = {'max':'DailyMaxValue'}, inplace = True)

But I am lost on how to include the temperature that was recorded at the time of each daily maximum value.

Would using .loc be a better method, where a loop filters through each day? Something like this:

for idx, days in df.groupby(df.index.date):
    print(days)
    daily_summary = df.loc[days['Value'].max().astype('int')]

If I run this, I can print each day's data (days), but the daily_summary line throws a TypeError: cannot do index indexing on with these indexers [0] of

Any tips greatly appreciated

Quang Hoang · Accepted Answer

You can resort to idxmax and loc:

idx = df.groupby(df.index.normalize())['Value'].idxmax()
ret_df = df.loc[idx].copy()

# get the hour
ret_df['hour'] = ret_df.index.hour

# set date as index
ret_df.index = ret_df.index.normalize()

Output:

            Temperature     Value  hour
2019-01-01     0.423320  0.998377    19
2019-01-02     0.117154  0.999976    10
2019-01-03     0.712291  0.999497    16
2019-01-04     0.404229  0.999996    21
2019-01-05     0.457618  0.999371    17
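
If you want the result in the exact shape the question asks for (a daily_summary frame with three columns), the same idxmax/loc idea can be sketched as below. This is only one possible arrangement: the column name TemperatureAtMax is made up for illustration, DailyMaxValue and hour are taken from the question's code, and freq='min' is assumed as the one-minute alias.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question
np.random.seed(11)
rows, cols = 50000, 2
data = np.random.rand(rows, cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='min')
df = pd.DataFrame(data, columns=['Temperature', 'Value'], index=tidx)

# For each calendar day, the timestamp at which Value is largest
idx = df.groupby(df.index.normalize())['Value'].idxmax()

# Look up both columns at those timestamps and index the result by day.
# 'TemperatureAtMax' is a hypothetical column name chosen for this sketch.
daily_summary = pd.DataFrame(
    {
        'DailyMaxValue': df.loc[idx, 'Value'].to_numpy(),
        'hour': idx.dt.hour.to_numpy(),
        'TemperatureAtMax': df.loc[idx, 'Temperature'].to_numpy(),
    },
    index=idx.index,
)

print(daily_summary.head())
```

Note that idxmax returns the first timestamp when a day's maximum occurs more than once, so hour and TemperatureAtMax refer to that first occurrence.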
