edutt
edutt

Reputation: 165

How to Incrementally Increase from a Starting Value to a Target Value, by Group, Over a Number of Periods, in a Pandas DataFrame?

I am trying to incrementally build a rolling "minimum" column which gradually increases in value from a minimum value up to halfway between the initial minimum and maximum values, but by group, and over a number of days in a Pandas DataFrame. The maximum value should stay the same over time. Picture a control chart, where the upper bound remains a flat line, and the lower bound linearly rises up to end halfway between the initial min and max bounds.

Here is code that does what I want in vanilla Python (without the grouping).

starting_min = 50
starting_max = 100
days = 10
a = []
for day in range(days+1):
    minimum = starting_min + (day/days)*(starting_max-starting_min)/2
    a.append(minimum)

Which generates the following:

Day Minimum
0 50.0
1 52.5
2 55.0
3 57.5
4 60.0
5 62.5
6 65.0
7 67.5
8 70.0
9 72.5
10 75.0

Here is some code to start working towards the desired DataFrame:

data = {'Group': ['A','B'], 'Minimum': [50,30], 'Maximum':[100,150]}
df = pd.DataFrame(data)
days = {'Days': [1,2,3,4,5,6,7,8,9,10]}
df_rolling = pd.DataFrame(columns = ['Day','Group','Minimum','Maximum'])

I would like df_rolling to become the following DataFrame:

Day Group Minimum Maximum
0 A 50.0 100
1 A 52.5 100
2 A 55.0 100
3 A 57.5 100
4 A 60.0 100
5 A 62.5 100
6 A 65.0 100
7 A 67.5 100
8 A 70.0 100
9 A 72.5 100
10 A 75.0 100
0 B 30.0 150
1 B 36.5 150
2 B 42.0 150
3 B 48.5 150
4 B 54.0 150
5 B 60.5 150
6 B 66.0 150
7 B 72.0 150
8 B 78.0 150
9 B 84.0 150
10 B 90.0 150

Much appreciate any help!

Upvotes: 1

Views: 408

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35646

Try creating a Multi-Index.from_product with the Groups and the days.

Then use set_index + reindex to apply the MultiIndex to the frame. Use method='ffill' to populate the starting values down the frame.

days = 10
midx = pd.MultiIndex.from_product((df['Group'], range(days + 1)),
                                  names=['Group', 'Day'])
df_rolling = (
    df.assign(i=0)
        .set_index(['Group', 'i'])
        .reindex(midx, method='ffill')
        .reset_index()
)

This will create a DataFrame like:

df_rolling:

   Group  Day  Minimum  Maximum
0      A    0       50      100
1      A    1       50      100
2      A    2       50      100
3      A    3       50      100
4      A    4       50      100
5      A    5       50      100
6      A    6       50      100
7      A    7       50      100
8      A    8       50      100
9      A    9       50      100
10     A   10       50      100
11     B    0       30      150
12     B    1       30      150
13     B    2       30      150
14     B    3       30      150
15     B    4       30      150
16     B    5       30      150
17     B    6       30      150
18     B    7       30      150
19     B    8       30      150
20     B    9       30      150
21     B   10       30      150

Then use math operations on this new frame:

df_rolling['Minimum'] = (
        df_rolling['Minimum'] +
        (df_rolling['Day'] / days) *
        (df_rolling['Maximum'] - df_rolling['Minimum']) /
        2
)

df_rolling:

   Group  Day  Minimum  Maximum
0      A    0     50.0      100
1      A    1     52.5      100
2      A    2     55.0      100
3      A    3     57.5      100
4      A    4     60.0      100
5      A    5     62.5      100
6      A    6     65.0      100
7      A    7     67.5      100
8      A    8     70.0      100
9      A    9     72.5      100
10     A   10     75.0      100
11     B    0     30.0      150
12     B    1     36.0      150
13     B    2     42.0      150
14     B    3     48.0      150
15     B    4     54.0      150
16     B    5     60.0      150
17     B    6     66.0      150
18     B    7     72.0      150
19     B    8     78.0      150
20     B    9     84.0      150
21     B   10     90.0      150

Complete Working Example:

import pandas as pd

df = pd.DataFrame({
    'Group': ['A', 'B'],
    'Minimum': [50, 30],
    'Maximum': [100, 150]
})

days = 10
midx = pd.MultiIndex.from_product((df['Group'], range(days + 1)),
                                  names=['Group', 'Day'])
df_rolling = (
    df.assign(i=0)
        .set_index(['Group', 'i'])
        .reindex(midx, method='ffill')
        .reset_index()
)

df_rolling['Minimum'] = (
        df_rolling['Minimum'] +
        (df_rolling['Day'] / days) *
        (df_rolling['Maximum'] - df_rolling['Minimum']) /
        2
)

print(df_rolling)

Upvotes: 1

Related Questions