Reputation: 554
I am working with a DataFrame that looks like this:
I wanted to create a new column 'Named' in order to use the categorical column 'Name' in linear regression. I did the following to accomplish that goal:
def named(name):
    if name == 'UNNAMED':
        return 0
    else:
        return 1

df['Named'] = df['Name'].apply(lambda name: named(name))
However, that gives a column that consists only of the value 1.
The function works on its own, but for some reason doesn't behave when used in the DataFrame.apply method.
Upvotes: 0
Views: 524
Reputation: 5437
It looks like a whole Series, rather than a single string, is being passed to named. This object clearly is not equal to 'UNNAMED', hence you get the 1. Did you try applymap? Alternatively, the following works for me as you desire:
df.assign(Named = lambda df: (df["Name"]!='UNNAMED').astype(int))
Moreover, on a recent pandas version, I can't reproduce your example; I'm seeing this error message:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
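To make the difference concrete, here is a minimal sketch, not a reproduction of the asker's setup. The sample data is made up, since the original DataFrame isn't shown. It compares element-wise Series.apply, a vectorized comparison, and what happens when the function receives the whole Series at once:

```python
import pandas as pd

# Hypothetical sample data; the DataFrame from the question is not shown.
df = pd.DataFrame({"Name": ["UNNAMED", "Alice", "UNNAMED", "Bob"]})


def named(name):
    return 0 if name == "UNNAMED" else 1


# Series.apply calls named() once per element, so each call sees one string.
elementwise = df["Name"].apply(named)

# Vectorized equivalent: compare the whole column, then cast booleans to int.
vectorized = (df["Name"] != "UNNAMED").astype(int)

# Handing the whole Series to named() makes `if`/`else` evaluate the truth
# value of a boolean Series, which pandas refuses to do.
try:
    named(df["Name"])
    raised = False
except ValueError:
    raised = True
```

If the per-element result is not what you expect, printing repr() of a few values from the column is a quick way to check what is actually being compared.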
Upvotes: 1
Reputation: 10624
The following should work:
df['Named'] = [0 if x.strip() == 'UNNAMED' else 1 for x in df['Name']]
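The strip() call matters if the values carry stray whitespace, which would make a plain equality check fail for every row. A vectorized sketch of the same idea, again on made-up data and assuming whitespace is the culprit:

```python
import pandas as pd

# Hypothetical data with leading/trailing spaces around some values.
df = pd.DataFrame({"Name": [" UNNAMED", "UNNAMED ", "Alice", "UNNAMED"]})

# .str.strip() removes surrounding whitespace from every element before
# the comparison; the boolean result is then cast to 0/1.
df["Named"] = (df["Name"].str.strip() != "UNNAMED").astype(int)
```

Note that missing values behave differently between the two versions: x.strip() raises AttributeError on a float NaN, while the .str accessor leaves NaN in place, so those rows end up as 1 here.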
Upvotes: 1