Reputation: 165
I am trying to incrementally build a rolling "minimum" column which gradually increases in value from a minimum value up to halfway between the initial minimum and maximum values, but by group, and over a number of days in a Pandas DataFrame. The maximum value should stay the same over time. Picture a control chart, where the upper bound remains a flat line, and the lower bound linearly rises up to end halfway between the initial min and max bounds.
Here is code that does what I want in vanilla Python (without the grouping).
starting_min = 50
starting_max = 100
days = 10
a = []
for day in range(days+1):
minimum = starting_min + (day/days)*(starting_max-starting_min)/2
a.append(minimum)
Which generates the following:
Day | Minimum |
---|---|
0 | 50.0 |
1 | 52.5 |
2 | 55.0 |
3 | 57.5 |
4 | 60.0 |
5 | 62.5 |
6 | 65.0 |
7 | 67.5 |
8 | 70.0 |
9 | 72.5 |
10 | 75.0 |
Here is some code to start working towards the desired DataFrame:
data = {'Group': ['A','B'], 'Minimum': [50,30], 'Maximum':[100,150]}
df = pd.DataFrame(data)
days = {'Days': [1,2,3,4,5,6,7,8,9,10]}
df_rolling = pd.DataFrame(columns = ['Day','Group','Minimum','Maximum'])
I would like df_rolling to become the following DataFrame:
Day | Group | Minimum | Maximum |
---|---|---|---|
0 | A | 50.0 | 100 |
1 | A | 52.5 | 100 |
2 | A | 55.0 | 100 |
3 | A | 57.5 | 100 |
4 | A | 60.0 | 100 |
5 | A | 62.5 | 100 |
6 | A | 65.0 | 100 |
7 | A | 67.5 | 100 |
8 | A | 70.0 | 100 |
9 | A | 72.5 | 100 |
10 | A | 75.0 | 100 |
0 | B | 30.0 | 150 |
1 | B | 36.5 | 150 |
2 | B | 42.0 | 150 |
3 | B | 48.5 | 150 |
4 | B | 54.0 | 150 |
5 | B | 60.5 | 150 |
6 | B | 66.0 | 150 |
7 | B | 72.0 | 150 |
8 | B | 78.0 | 150 |
9 | B | 84.0 | 150 |
10 | B | 90.0 | 150 |
Much appreciate any help!
Upvotes: 1
Views: 408
Reputation: 35646
Try creating a Multi-Index.from_product
with the Groups and the days.
Then use set_index
+ reindex
to apply the MultiIndex to the frame. Use method='ffill'
to populate the starting values down the frame.
days = 10
midx = pd.MultiIndex.from_product((df['Group'], range(days + 1)),
names=['Group', 'Day'])
df_rolling = (
df.assign(i=0)
.set_index(['Group', 'i'])
.reindex(midx, method='ffill')
.reset_index()
)
This will create a DataFrame like:
df_rolling
:
Group Day Minimum Maximum
0 A 0 50 100
1 A 1 50 100
2 A 2 50 100
3 A 3 50 100
4 A 4 50 100
5 A 5 50 100
6 A 6 50 100
7 A 7 50 100
8 A 8 50 100
9 A 9 50 100
10 A 10 50 100
11 B 0 30 150
12 B 1 30 150
13 B 2 30 150
14 B 3 30 150
15 B 4 30 150
16 B 5 30 150
17 B 6 30 150
18 B 7 30 150
19 B 8 30 150
20 B 9 30 150
21 B 10 30 150
Then use math operations on this new frame:
df_rolling['Minimum'] = (
df_rolling['Minimum'] +
(df_rolling['Day'] / days) *
(df_rolling['Maximum'] - df_rolling['Minimum']) /
2
)
df_rolling
:
Group Day Minimum Maximum
0 A 0 50.0 100
1 A 1 52.5 100
2 A 2 55.0 100
3 A 3 57.5 100
4 A 4 60.0 100
5 A 5 62.5 100
6 A 6 65.0 100
7 A 7 67.5 100
8 A 8 70.0 100
9 A 9 72.5 100
10 A 10 75.0 100
11 B 0 30.0 150
12 B 1 36.0 150
13 B 2 42.0 150
14 B 3 48.0 150
15 B 4 54.0 150
16 B 5 60.0 150
17 B 6 66.0 150
18 B 7 72.0 150
19 B 8 78.0 150
20 B 9 84.0 150
21 B 10 90.0 150
Complete Working Example:
import pandas as pd
df = pd.DataFrame({
'Group': ['A', 'B'],
'Minimum': [50, 30],
'Maximum': [100, 150]
})
days = 10
midx = pd.MultiIndex.from_product((df['Group'], range(days + 1)),
names=['Group', 'Day'])
df_rolling = (
df.assign(i=0)
.set_index(['Group', 'i'])
.reindex(midx, method='ffill')
.reset_index()
)
df_rolling['Minimum'] = (
df_rolling['Minimum'] +
(df_rolling['Day'] / days) *
(df_rolling['Maximum'] - df_rolling['Minimum']) /
2
)
print(df_rolling)
Upvotes: 1