k.ko3n

Reputation: 954

Grouping by hours without adding non-existing hours in Python

I want to group my datetime-indexed df by hour. The source data is at 5-minute intervals, but only from 6am to 6pm; there is no data for the night hours.

My code is like this:

hourly= df.resample('60T').sum().sort_index().dropna(how='any')

But the result contains extra night-time hours, so that each day is padded to a full 24 hours, and those hours get zero values. I don't want that. I only need the hours that actually exist in the source data.

Please help.

Upvotes: 1

Views: 56

Answers (1)

jpp

Reputation: 164703

You can use groupby with a calculated series, in this case the timestamps floored to the hour, so only hours that occur in the data produce a group:

import pandas as pd

# example dataframe
dates = ['2018-01-01 15:01:00', '2018-01-01 15:23:15', '2018-01-01 16:30:05']
df = pd.DataFrame({'date': pd.to_datetime(dates), 'values': [1, 2, 3]})

res = df.groupby(df['date'].dt.floor('60min'))['values'].sum()

print(res)

date
2018-01-01 15:00:00    3
2018-01-01 16:00:00    3
Name: values, dtype: int64
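
Since the frame in the question is datetime-indexed rather than having a 'date' column, the same idea can be applied to the index directly. The following is a minimal sketch under that assumption; the sample timestamps and the names idx, hourly and hourly_resampled are made up for illustration. It also shows one way to keep resample: sum() on an empty bin returns 0 rather than NaN by default, which is likely why dropna removed nothing, and min_count=1 turns those empty bins into NaN.

```python
import pandas as pd

# Hypothetical data: 5-minute readings with an overnight gap.
idx = pd.to_datetime([
    '2018-01-01 17:50:00', '2018-01-01 17:55:00',
    '2018-01-02 06:00:00', '2018-01-02 06:05:00',
])
df = pd.DataFrame({'values': [1, 2, 3, 4]}, index=idx)

# Option 1: group on the index floored to the hour.
# Only hours that appear in the data become groups.
hourly = df.groupby(df.index.floor('60min')).sum()

# Option 2: keep resample, but require at least one observation per bin
# so that empty hours become NaN and dropna can discard them.
hourly_resampled = df.resample('60min').sum(min_count=1).dropna(how='any')

print(hourly)
print(hourly_resampled)
```

Both results should contain only the 17:00 and 06:00 rows. Note that the resample route passes through NaN, so integer columns come back as floats.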

Upvotes: 2
