Michal
Michal

Reputation: 73

How to replace <NA> values based on another column main value in pandas

Problem with pandas

I have a following dataframe:

      product code trends_vs  trends_100_0
1   aaaaaaa   it       150           100
6   aaaaaaa   pl      <NA>            90
11  aaaaaaa   ro      <NA>            82
15  aaaaaaa   sk      <NA>            13
21   bbbbb   ro       260           100
25   bbbbb   cz      <NA>            62
29   bbbbb   sk      <NA>            42

I want to calculate values from trends_vs column the following way:

each product in a ‘product’ column have ONLY one trends_vs value which corresponds with trends_100_0 - 100 value.

And now to get trends_vs value for second row we have to do 150 * (90/100)

The output and logic should look like this:

      product code trends_vs  trends_100_0
1   aaaaaaa   it       150           100
6   aaaaaaa   pl      135            90      ---> 150 * (90/100)
11  aaaaaaa   ro      123            82    ---> 150 * (82/100)
15  aaaaaaa   sk      19.5            13  ---> 150 * (13/100)
21   bbbbb   ro       260           100
25   bbbbb   cz      161.2            62  ---> 260 * (62/100) # 260 instead of 150 because its a different product 'bbbbb'
29   bbbbb   sk      109.2            42 ---> 260 * (42/100)

I tried with groupby or map but something not work as expected. I have a solution when each column should corresponds with each other in a row but in this scenario where we should calculate based on one main value for each product?

Upvotes: 1

Views: 173

Answers (3)

Corralien
Corralien

Reputation: 120499

Another method without groupby:

df['trends_vs'] = (df.sort_values(['product', 'trends_100_0'])['trends_vs'].bfill()
                       * df['trends_100_0'] / 100)
print(df)

# Output
    product code  trends_vs  trends_100_0
1   aaaaaaa   it      150.0           100
6   aaaaaaa   pl      135.0            90
11  aaaaaaa   ro      123.0            82
15  aaaaaaa   sk       19.5            13
21    bbbbb   ro      260.0           100
25    bbbbb   cz      161.2            62
29    bbbbb   sk      109.2            42

Upvotes: 4

ArchAngelPwn
ArchAngelPwn

Reputation: 3046

This will allow you to find the trends_vs based on product and the trends_100_0 based on the results

df['trends_vs'] = df.groupby('product')['trends_vs'].transform(max)
df['trends_vs'] = df['trends_vs'] * (df['trends_100_0']/100)
df

Upvotes: 0

Umar.H
Umar.H

Reputation: 23099

use .groupby.transform and then a simple .mul with .div

we can use .fillna to target your rows.

df["trends_vs"] = df["trends_vs"].fillna(
df.groupby("product")["trends_vs"]
.transform("min")
.mul((df["trends_100_0"].div(100)))
)


    product code  trends_vs  trends_100_0
1   aaaaaaa   it      150.0           100
6   aaaaaaa   pl      135.0            90
11  aaaaaaa   ro      123.0            82
15  aaaaaaa   sk       19.5            13
21    bbbbb   ro      260.0           100
25    bbbbb   cz      161.2            62
29    bbbbb   sk      109.2            4

Upvotes: 0

Related Questions