aristotle29

Reputation: 79

update a column based on another column in pandas

I have a sample df:

city        population          is_identified
Newyork       100000               yes
Buffalo       200000               yes
WashingtonDC  150000               

How can I remove the values in population if is_identified is empty or np.nan?

The output should look like this:

city        population          is_identified
Newyork       100000               yes
Buffalo       200000               yes
WashingtonDC    

Upvotes: 1

Views: 1898

Answers (1)

jezrael

Reputation: 862431

You can chain two conditions, Series.isna and a test for an empty string, and pass the combined mask to DataFrame.loc:

df.loc[df['is_identified'].isna() | df['is_identified'].eq(''), 'population'] = ''
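For a reproducible check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same chained mask. The frame construction and the name `mask` are illustrative; the blank is_identified cell is assumed to be NaN. The cast to object is also an assumption on my side: recent pandas versions complain when a string is written into an integer column, so casting first keeps the assignment quiet.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question; the blank cell is modelled as NaN.
df = pd.DataFrame({
    "city": ["Newyork", "Buffalo", "WashingtonDC"],
    "population": [100000, 200000, 150000],
    "is_identified": ["yes", "yes", np.nan],
})

# True where is_identified is missing or an empty string.
mask = df["is_identified"].isna() | df["is_identified"].eq("")

# Assumption: cast to object first so the column can hold both integers and ''.
df["population"] = df["population"].astype(object)
df.loc[mask, "population"] = ""

print(df)
```

Only the WashingtonDC row is matched by the mask, so only its population is blanked.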

Or replace empty strings with NaN and test for missing values, or replace NaN with an empty string and test for empty strings:

df.loc[df['is_identified'].replace('', np.nan).isna(), 'population'] = ''
df.loc[df['is_identified'].fillna('').eq(''), 'population'] = ''
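The three ways of building the mask should select the same rows. A small sketch to confirm that on a made-up Series (the values and the names `s`, `m1`, `m2`, `m3` are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical column containing a normal value, an empty string and a NaN.
s = pd.Series(["yes", "", np.nan, "no"], name="is_identified")

m1 = s.isna() | s.eq("")            # chain both tests
m2 = s.replace("", np.nan).isna()   # turn '' into NaN, then test for missing
m3 = s.fillna("").eq("")            # turn NaN into '', then test for ''

print(pd.concat([m1, m2, m3], axis=1, keys=["m1", "m2", "m3"]))
```

Which one to use is mostly a matter of taste; the chained version does not create an intermediate replaced copy of the column.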

If you need to run arithmetic on the population column later, it is better to set the values to NaN instead:

df.loc[df['is_identified'].isna() | df['is_identified'].eq(''), 'population'] = np.nan

# if you need integers with missing values, convert to the nullable Int64 dtype
# df['population'] = df['population'].astype('Int64')
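A minimal sketch of the missing-value route end to end, assuming the same sample frame as in the question. Note that it uses Series.mask instead of the .loc assignment above; that is a swapped-in variant which returns a new column with NaN where the condition holds, so nothing is written into the integer column in place. The names `mask` and `total` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question; the blank cell is modelled as NaN.
df = pd.DataFrame({
    "city": ["Newyork", "Buffalo", "WashingtonDC"],
    "population": [100000, 200000, 150000],
    "is_identified": ["yes", "yes", np.nan],
})

mask = df["is_identified"].isna() | df["is_identified"].eq("")

# Series.mask puts NaN where mask is True; the result is float64.
df["population"] = df["population"].mask(mask)

# Nullable integer dtype: whole numbers are kept, the gap is shown as <NA>.
df["population"] = df["population"].astype("Int64")

# Arithmetic still works; missing values are skipped by default.
total = df["population"].sum()

print(df)
print(total)
```

With an empty string in the column instead, the same sum would not be possible without cleaning the column first.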

Upvotes: 2
