Reputation: 113
I am trying to take an aggregated time-series and break it down to its component pieces. For example, I have a time series forecast, that I would like to break down between 3 buckets (red,blue,oj)
import pandas as pd
import datetime as dt
index = pd.date_range(start = dt.date.today(), end = dt.date.today() + dt.timedelta(days = 400), freq = 'M')
df = pd.DataFrame(index = index)
df['Volume'] = [random.randint(340,950) for x in range(len(df))]
df
volume_breakdown = pd.Series({
"red":.5,
"blue":.3,
"oj":.2
})
volume_breakdown
Trying to multiply:
df * volume_breakdown
I know this is related to indexing, but is there a way to do this without using for loops and doing a concat/join/merge?
Edit: Resolved with answer below,
pd.DataFrame(
df["Volume"].values.reshape((-1, 1)) * volume_breakdown.values, index = index, columns = volume_breakdown.index
)
Upvotes: 0
Views: 67
Reputation: 120409
Use numpy
to avoid indexing problem:
df[volume_breakdown.index] = df["Volume"].values.reshape((-1, 1)) \
* volume_breakdown.values
>>> df
Volume red blue oj
2021-05-31 722 361.0 216.6 144.4
2021-06-30 530 265.0 159.0 106.0
2021-07-31 762 381.0 228.6 152.4
2021-08-31 706 353.0 211.8 141.2
2021-09-30 811 405.5 243.3 162.2
2021-10-31 908 454.0 272.4 181.6
2021-11-30 912 456.0 273.6 182.4
2021-12-31 740 370.0 222.0 148.0
2022-01-31 692 346.0 207.6 138.4
2022-02-28 798 399.0 239.4 159.6
2022-03-31 497 248.5 149.1 99.4
2022-04-30 622 311.0 186.6 124.4
2022-05-31 589 294.5 176.7 117.8
Upvotes: 2