Adam_G

Reputation: 7879

Convert Pandas DataFrame Column From String to Int Based on Conditional

I have a dataframe that looks like

df

viz  a1_count  a1_mean     a1_std
n         3        2   0.816497
y         0      NaN        NaN 
n         2       51  50.000000

I want to convert the "viz" column to 0 and 1, based on a conditional. I've tried:

df['viz'] = 0 if df['viz'] == "n" else 1

but I get:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Upvotes: 34

Views: 101652

Answers (2)

Display name

Reputation: 926

From @TMWP's comment above:

pd.to_numeric(myDF['myDFCell'], errors='coerce')

It works like a charm and is a quick, simple one-liner.
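
As a rough illustration of what that call does, here is a minimal sketch on a made-up Series (the sample values are assumptions, not data from the question). With errors='coerce', strings that cannot be parsed as numbers become NaN, so this suits columns of digit strings; a column holding only "n" and "y" would come back as all NaN.

```python
import pandas as pd

# Hypothetical sample data; "abc" stands in for any non-numeric string.
s = pd.Series(["1", "2", "abc"])

# errors="coerce" replaces values that cannot be parsed with NaN
# instead of raising an exception.
converted = pd.to_numeric(s, errors="coerce")
print(converted)
```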

Upvotes: 4

EdChum

Reputation: 394031

The comparison df['viz'] == "n" is evaluated against the entire series and returns a boolean Series, but if ... else needs a single truth value, which raises the ValueError you saw. A simple method is to cast the boolean series to int:

In [84]:
df['viz'] = (df['viz'] !='n').astype(int)
df

Out[84]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000

You can also use np.where:

In [86]:
df['viz'] = np.where(df['viz'] == 'n', 0, 1)
df

Out[86]:
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
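
For anyone who wants to run the np.where version outside the session above, a self-contained sketch follows. The frame is rebuilt by hand from the question's sample with only two of its columns, which is an assumption made to keep the example short.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's frame (only two columns kept).
df = pd.DataFrame({"viz": ["n", "y", "n"], "a1_count": [3, 0, 2]})

# np.where evaluates the condition element-wise:
# 0 where viz == "n", 1 everywhere else.
df["viz"] = np.where(df["viz"] == "n", 0, 1)
print(df)
```

Note that anything other than "n" maps to 1 here, including missing values, so this is only safe when the column holds exactly the two expected labels.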

Output from the boolean comparison:

In [89]:
df['viz'] !='n'

Out[89]:
0    False
1     True
2    False
Name: viz, dtype: bool

And then casting to int:

In [90]:
(df['viz'] !='n').astype(int)

Out[90]:
0    0
1    1
2    0
Name: viz, dtype: int32
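
Putting the pieces together, the sketch below first reproduces the error from the question and then applies the cast. The one-column frame and the names error_message and mask are illustrative assumptions. The integer width reported by astype(int) (int32 above) depends on the platform and NumPy version, so int64 is just as likely on other machines.

```python
import pandas as pd

# One-column stand-in for the question's frame.
df = pd.DataFrame({"viz": ["n", "y", "n"]})

# The original attempt: a Series has no single truth value,
# so Python's conditional expression raises ValueError.
error_message = None
try:
    df["viz"] = 0 if df["viz"] == "n" else 1
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Element-wise comparison gives a boolean Series...
mask = df["viz"] != "n"
# ...and casting it to int turns False/True into 0/1.
df["viz"] = mask.astype(int)
print(df)
```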

Upvotes: 32
