Reputation: 23
Total | Male | Female |
---|---|---|
24 | NAN | NAN |
45 | 21 | 24 |
36 | NAN | NAN |
50 | NAN | NAN |
This is my dataframe with 3 columns in pandas. The "Male" and "Female" columns contains null values.
So, is there any method to divide each value of "Total" and feed into respective "Male" and "Female" rows?
My desired output would be this:
Total | Male | Female |
---|---|---|
24 | 12 | 12 |
45 | 21 | 24 |
36 | 18 | 18 |
50 | 25 | 25 |
Upvotes: 2
Views: 81
Reputation: 75080
This is a good example of df.mask
:
v = df['Total'].div(2)
df[['Male','Female']] = df[['Male','Female']].mask(df.isna(),v,axis=0)
#or for new df: out = df.assign(**df[['Male','Female']].mask(df.isna(),v,axis=0))
Total Male Female
0 24 12 12
1 45 21 24
2 36 18 18
3 50 25 25
Upvotes: 5
Reputation: 35626
One approach would be to divide Total
by 2 then ffill
across rows to fill the missing values:
cols = ['Male', 'Female']
df[cols] = df.assign(Total=df['Total'] / 2).ffill(axis=1)[cols]
Or by creating a new DataFrame to use fillna
with. In case there are rows with some NaN and some valid values where ffill
may propagate incorrect values:
cols = ['Male', 'Female']
x = df['Total'] / 2
df = df.fillna(pd.DataFrame({c: x for c in cols}))
df
:
Total Male Female
0 24 12.0 12.0
1 45 21.0 24.0
2 36 18.0 18.0
3 50 25.0 25.0
Setup:
import numpy as np
import pandas as pd
df = pd.DataFrame({
'Total': [24, 45, 36, 50],
'Male': [np.nan, 21, np.nan, np.nan],
'Female': [np.nan, 24, np.nan, np.nan]
})
Upvotes: 2