Reputation: 9348
A data frame as below, I want to resample them by using 'time', 'cars', and 'flow' and transpose them into columns.
So I need to:
The final result should look like:
There are steps behind and I tried my best but it only ends up with:
data = {'time': ["9:17", "9:17", "9:20", "9:21", "9:22", "9:23", "9:23", "9:25", "9:26", "9:27", "9:36", "9:36"],
'tempreture': [37.7, 37.12, 37.7, 37.11, 37.9, 37.1, 37.8, 37.2, 37.6, 37.7, 37.1, 37.3],
'cars' : [737, 22, 42, 1, 45, 13, 47, 304, 10, 192, 5, 16],
'flow': ["In","Out","In","Unknown","Out","In","Out","In","Unknown","Out","In","In"]}
df = pd.DataFrame(data)
df = df[['time', 'cars', 'flow']]
idx = pd.date_range("9:00", "10:00", freq="1min")
idx = idx.rename('time')
df.index = pd.to_datetime(df.index)
df = df.reindex(idx, fill_value=0)
df = df.pivot(columns='flow').resample('30T').sum()
df = df.reset_index()
df = df.set_index('time')
df.index = pd.to_datetime(df.index)
df = df.stack().T
df["Total"] = df.sum(axis=1)
print (df)
time 2020-10-21 09:00:00 2020-10-21 09:30:00 2020-10-21 10:00:00 Total
flow 0 0 0
cars 0 0 0 0
What's the right way to write it?
Upvotes: 1
Views: 847
Reputation: 862641
Use DataFrame.pivot_table
with aggregate sum
, then DataFrame.reindex
with DataFrame.resample
and last convert to one row DataFrame by DataFrame.stack
, Series.to_frame
and transpose with flatten MultiIndex
:
df = df[['time', 'cars', 'flow']]
df = df.pivot_table(index='time',columns='flow', values='cars', aggfunc='sum')
df.index = pd.to_datetime(df.index + ':00')
idx = pd.date_range("9:00", "10:00", freq="1min", name='time')
df = df.reindex(idx, fill_value=0).resample('30T').sum()
df['Total'] = df.sum(axis=1)
df.index = df.index.strftime('%H:%M')
df = df.stack().to_frame().T
df.columns = df.columns.map(lambda x: f'{x[0]} {x[1]}')
print (df)
09:00 In 09:00 Out 09:00 Unknown 09:00 Total 09:30 In 09:30 Out \
0 1096.0 306.0 11.0 1413.0 21.0 0.0
09:30 Unknown 09:30 Total 10:00 In 10:00 Out 10:00 Unknown 10:00 Total
0 0.0 21.0 0.0 0.0 0.0 0.0
But it seems reindex
here is not necessary:
df = df.pivot_table(index='time',columns='flow', values='cars', aggfunc='sum')
df.index = pd.to_datetime(df.index + ':00')
df = df.resample('30T').sum()
df['Total'] = df.sum(axis=1)
df.index = df.index.strftime('%H:%M')
df = df.stack().to_frame().T
df.columns = df.columns.map(lambda x: f'{x[0]} {x[1]}')
Upvotes: 1