Reputation: 149
I'm new to Python and pandas. I have a DataFrame that looks as follows:
ID NAME
0 0000001 Apple
1 0000002 35
2 0000003 Grape
3 0000004 22
4 0000005 Banana
5 0000006 12
My goal is to replace the numeric values in the NAME column with 'Unknown'.
So far, I have tried the following:
out['NAME'] = out.apply(lambda x: x['NAME'].replace(x['NAME'], 'Unknown'))
But it doesn't replace anything and instead raises KeyError: ('NAME', 'occurred at index ID')
Ultimately I am expecting an output as follows:
ID NAME
0 0000001 Apple
1 0000002 Unknown
2 0000003 Grape
3 0000004 Unknown
4 0000005 Banana
5 0000006 Unknown
Upvotes: 1
Views: 185
Reputation: 31
You can also just write a plain Python function:
def replaceName(name):
    if name.isnumeric():
        return "Unknown"
    else:
        return name
and apply it to the column with map:
df["NAME"] = df["NAME"].map(replaceName)
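As a minimal end-to-end sketch of this approach, using the question's `out` frame and `NAME` column: the sample data is rebuilt by hand here, and the `isinstance` guard is an added assumption so that non-string values pass through untouched. It also shows why the original attempt failed: `DataFrame.apply` defaults to `axis=0`, so the lambda receives each column as a Series indexed by row labels, and looking up `'NAME'` in it raises the KeyError.

```python
import pandas as pd

# Sample data mirroring the question; IDs are strings so the leading zeros survive.
out = pd.DataFrame({
    "ID": ["0000001", "0000002", "0000003", "0000004", "0000005", "0000006"],
    "NAME": ["Apple", "35", "Grape", "22", "Banana", "12"],
})

def replace_name(name):
    """Return 'Unknown' for all-digit strings, otherwise the value unchanged."""
    # The isinstance check is an added guard (assumption): NaN or ints are left alone.
    if isinstance(name, str) and name.isnumeric():
        return "Unknown"
    return name

# Series.map calls the function once per element of the NAME column,
# unlike out.apply(...) with the default axis=0, which passes whole columns.
out["NAME"] = out["NAME"].map(replace_name)
print(out)
```

Passing the function directly to `map` avoids the extra lambda wrapper.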
Upvotes: 0
Reputation: 4761
You can use pandas.to_numeric to determine which rows can be converted to numeric:
>>> df.loc[~pd.to_numeric(df["NAME"], errors = "coerce").isnull(), "NAME"] = "unknown"
>>> df
ID NAME
0 1 Apple
1 2 unknown
2 3 Grape
3 4 unknown
4 5 Banana
5 6 unknown
With errors = "coerce", values that cannot be parsed are set to NaN, so the rows that are not null afterwards are exactly the numeric ones.
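To make the mask visible, here is a small sketch on a hand-made Series (the sample values are assumptions, not from the question). It also shows that `to_numeric` treats signed and decimal strings as numeric, which `str.isnumeric` would not.

```python
import pandas as pd

# Hypothetical sample values, including a negative and a decimal string.
names = pd.Series(["Apple", "35", "-3", "1.5", "Grape"])

# Unparseable strings become NaN instead of raising an error.
parsed = pd.to_numeric(names, errors="coerce")

# True where the value parsed as a number.
mask = parsed.notna()

# Series.mask replaces the entries where the condition is True.
cleaned = names.mask(mask, "Unknown")

print(mask.tolist())     # [False, True, True, True, False]
print(cleaned.tolist())  # ['Apple', 'Unknown', 'Unknown', 'Unknown', 'Grape']
```

Whether "-3" and "1.5" should count as numbers depends on the data, so this choice is worth making deliberately.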
Upvotes: 0
Reputation: 26676
import numpy as np
out['NAME'] = np.where(out.NAME.str.contains(r'\d'), 'unknown', out.NAME)
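Note that `str.contains` with `\d` matches any value that has a digit somewhere in it, not only values made up entirely of digits. The sketch below contrasts it with `str.fullmatch` (available in pandas 1.1 and later) on hand-made sample values; "Route66" is a hypothetical entry added to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; "Route66" mixes letters and digits.
names = pd.Series(["Apple", "35", "Route66", "Grape"])

# True when the string contains at least one digit anywhere.
has_digit = names.str.contains(r"\d")

# True only when the whole string consists of digits.
all_digits = names.str.fullmatch(r"\d+")

loose = np.where(has_digit, "unknown", names)
strict = np.where(all_digits, "unknown", names)

print(list(loose))   # ['Apple', 'unknown', 'unknown', 'Grape']
print(list(strict))  # ['Apple', 'unknown', 'Route66', 'Grape']
```

For the data in the question both variants give the same result, since every NAME is either all letters or all digits.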
Upvotes: 0
Reputation: 15872
Use pandas.Series.str.isnumeric:
>>> out.loc[out.NAME.str.isnumeric(), 'NAME'] = 'Unknown'
>>> out
ID NAME
0 0000001 Apple
1 0000002 Unknown
2 0000003 Grape
3 0000004 Unknown
4 0000005 Banana
5 0000006 Unknown
Upvotes: 1