MM55
MM55

Reputation: 65

How to create a new column based on certain values in another column

For each day, I have different colors which are used. I need to create a new column which looks at the ratio of users to the first day for each color.

For example, for color blue, the earliest date is 2020-01-01 and the corresponding 'users' value is 100. So, for 2020-01-02, the value I want should be 102/100 = 1.02

raw_data = [
        {'date': '2020-01-01', 'color': 'blue', 'users': 100},
        {'date': '2020-01-02', 'color': 'blue', 'users': 102},
        {'date': '2020-01-03', 'color': 'blue', 'users': 104},
        {'date': '2020-01-04', 'color': 'blue', 'users': 98},
        {'date': '2020-01-02', 'color': 'red', 'users': 100},
        {'date': '2020-01-03', 'color': 'red', 'users': 107},
        {'date': '2020-01-04', 'color': 'red', 'users': 114},
        {'date': '2020-01-05', 'color': 'red', 'users': 150},
    ]

So far, I know I can get the min date for each color using the code below, but not sure what to do next

grouped = df.groupby('color')['date']
min = grouped.min()

Upvotes: 2

Views: 48

Answers (1)

Shubham Sharma
Shubham Sharma

Reputation: 71707

sort the values of the dataframe on Date then groupby on color and transform the column users using first, then use Series.div to divide users by this transformed column to get the ratio:

df['date'] = pd.to_datetime(df['date'])
df['ratio'] = df['users'].div(
    df.sort_values('date').groupby('color')['users'].transform('first')
)

        date color  users  ratio
0 2020-01-01  blue    100   1.00
1 2020-01-02  blue    102   1.02
2 2020-01-03  blue    104   1.04
3 2020-01-04  blue     98   0.98
4 2020-01-02   red    100   1.00
5 2020-01-03   red    107   1.07
6 2020-01-04   red    114   1.14
7 2020-01-05   red    150   1.50

Upvotes: 3

Related Questions