Naik

Reputation: 1255

Data frame (Pandas) filling Missing Values

We need to fill the missing values in one column of a data frame (say df['A']) according to the following rules:

1. If the value of df['B'] in the same row is greater than 1000, use 0.
2. Otherwise, use the mean of df['A'].

I used the following code and it worked well.

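mean_value = df['A'].mean()
# Select rows and column in a single .loc call to avoid chained assignment
df.loc[df['A'].isna() & (df['B'] > 1000), 'A'] = 0
# Assign the result back instead of calling fillna in place on a column selection
df['A'] = df['A'].fillna(mean_value)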

However, this takes two lines to fill the missing values. Is there a way to do it in a single line?

Upvotes: 0

Views: 96

Answers (4)

Andy L.

Reputation: 25269

You may try this arithmetic approach, although a one-liner in this case just makes the code harder to read. Since you edited your question to change 2000 to 0, the addition is no longer needed. So the answer is:

df['A'] = df.A.fillna((df['B'] <= 1000) * df.A.mean())
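To see why the multiplication works, here is a minimal sketch that runs the same expression on the sample data posted elsewhere in this thread. The names fill_values and filled are only illustrative. The boolean mask behaves as 1/0 when multiplied, so each row's fill value is either the mean or 0, and fillna then picks the value with the matching index label.

```python
import numpy as np
import pandas as pd

# Sample data taken from the example input in this thread.
df = pd.DataFrame({
    "A": [5.0, np.nan, 3.0, 4.0, np.nan],
    "B": [500, 2000, 1500, 1100, 7],
})

mean_value = df["A"].mean()  # NaN is skipped by default, so this is 4.0

# True/False multiply as 1/0: rows with B <= 1000 get the mean, all others get 0.
fill_values = (df["B"] <= 1000) * mean_value

# fillna accepts a Series and only uses it where A is missing.
filled = df["A"].fillna(fill_values)
print(filled.tolist())  # [5.0, 0.0, 3.0, 4.0, 4.0]
```

Note that this only works because the fill value for the B > 1000 case is 0; any other constant would need the extra addition mentioned above.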

Upvotes: 1

moys

Reputation: 8033

Maybe you can use this (it fills with 2000, as the question originally asked; use 0 instead for the edited question):

import numpy as np

check1 = df['A'].isna()
check2 = check1 & (df['B'] > 1000)
df['A'] = np.where(check1, np.where(check2, 2000, df['A'].mean()), df['A'])

Example Input

      A     B
0   5.0     500
1   NaN     2000
2   3.0     1500
3   4.0     1100
4   NaN     7

Example Output

      A     B
0   5.0     500
1   2000.0  2000
2   3.0     1500
3   4.0     1100
4   4.0     7

Upvotes: 1

BENY

Reputation: 323386

You can use np.select:

import numpy as np

con1 = df['A'].isna() & (df['B'] > 1000)
con2 = df['A'].isna()
df['A'] = np.select([con1, con2], [0, df['A'].mean()], default=df['A'])
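As a quick check, the sketch below runs the np.select approach on the sample data from this thread and compares it with the two-step version from the question. The names selected and expected are only illustrative. np.select takes the first condition that is True for each row, which is why the more specific con1 has to come before con2.

```python
import numpy as np
import pandas as pd

# Sample data taken from the example input in this thread.
df = pd.DataFrame({
    "A": [5.0, np.nan, 3.0, 4.0, np.nan],
    "B": [500, 2000, 1500, 1100, 7],
})

is_missing = df["A"].isna()
con1 = is_missing & (df["B"] > 1000)  # missing and B > 1000 -> 0
con2 = is_missing                     # any other missing row -> mean

# Conditions are checked in order; rows matching neither keep their value.
selected = np.select(
    [con1.to_numpy(), con2.to_numpy()],
    [0.0, df["A"].mean()],
    default=df["A"].to_numpy(),
)

# Two-step reference, as in the question.
expected = df["A"].copy()
expected[con1] = 0
expected = expected.fillna(df["A"].mean())

print(selected.tolist() == expected.tolist())  # True
```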

Upvotes: 1

Naik

Reputation: 1255

I used the following line of code and it worked, but I still believe there should be a more elegant way to solve this problem.

df['A'] = df.apply(lambda x: x['A'] if not(np.isnan(x['A'])) else (0 if x['B'] > 1000 else mean_value), axis = 1)

Any ideas?
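One possible vectorized alternative to the row-wise apply, offered as a sketch rather than a definitive answer: build the per-row fill value with np.where and pass it to fillna. The names fill_values, vectorized and via_apply are only illustrative, and the data is the sample from this thread.

```python
import numpy as np
import pandas as pd

# Sample data taken from the example input in this thread.
df = pd.DataFrame({
    "A": [5.0, np.nan, 3.0, 4.0, np.nan],
    "B": [500, 2000, 1500, 1100, 7],
})

mean_value = df["A"].mean()

# Per-row fill value: 0 where B > 1000, otherwise the column mean.
fill_values = pd.Series(np.where(df["B"] > 1000, 0.0, mean_value), index=df.index)

# fillna only touches rows where A is missing.
vectorized = df["A"].fillna(fill_values)

# The row-wise apply version, for comparison.
via_apply = df.apply(
    lambda x: x["A"] if not np.isnan(x["A"]) else (0 if x["B"] > 1000 else mean_value),
    axis=1,
)

print(vectorized.tolist() == via_apply.tolist())  # True
```

The vectorized form avoids calling a Python function once per row, which tends to matter on large data frames.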

Upvotes: 0
