Reputation: 11
I want to remove all numbers from the entries of a certain column in a pandas DataFrame. Unfortunately, string methods like .join() and .find() do not work here: when I define a function to iterate over the entries, I get an error saying that float objects have no .find or .join attributes. Is there a pandas command that takes care of this?
def remove(data):
    for i in data if not i.isdigit():
        data=''
        data=data.join(i)
    return data
myfile['column_name']=myfile['column_name'].apply(remove())
Upvotes: 0
Views: 5475
Reputation: 153510
Or look at using pd.to_numeric with errors='coerce' to cast the column to numeric and turn the non-numeric values into NaN:
Using @Raidex setup:
import pandas as pd
s = pd.DataFrame({'x':['p','2','3','d','f','0']})
pd.to_numeric(s['x'], errors='coerce')
Output:
0    NaN
1    2.0
2    3.0
3    NaN
4    NaN
5    0.0
Name: x, dtype: float64
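The coerced result can also serve as a mask, which helps when the column contains decimals or negative numbers as text. The following is a minimal sketch with made-up sample values; the names is_text and text_only are chosen here for illustration only:

```python
import pandas as pd

# Hypothetical sample data, including a decimal and a negative number.
s = pd.DataFrame({'x': ['p', '2', '3.5', '-1', 'd']})

# Entries that fail to parse as numbers become NaN after coercion,
# so isna() marks the entries that are plain text.
is_text = pd.to_numeric(s['x'], errors='coerce').isna()

# Keep the text entries; entries that parsed as numbers become NaN.
text_only = s['x'].where(is_text)
print(text_only.tolist())
```

This keeps the original strings untouched instead of converting the whole column to float.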
EDIT: to handle either situation, combine where with str.isdigit. To keep only the non-numeric entries:
s['x'].where(~s['x'].str.isdigit())
Output:
0      p
1    NaN
2    NaN
3      d
4      f
5    NaN
Name: x, dtype: object
Or, to keep only the numeric entries:
s['x'].where(s['x'].str.isdigit())
Output:
0    NaN
1      2
2      3
3    NaN
4    NaN
5      0
Name: x, dtype: object
Upvotes: 0
Reputation: 7510
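You can replace every entry that consists only of digits like this:
import pandas as pd
df = pd.DataFrame({'x': ['1', '2', 'C', '4']})
# Overwrite only column x in the rows whose entry consists solely of digits.
# "NaN" here is the literal string; use float("nan") for a real missing value.
df.loc[df["x"].str.isdigit(), "x"] = "NaN"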
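If the numbers are embedded in longer strings rather than making up whole entries, one option is to strip the digit characters with a regular expression. A minimal sketch, assuming the column holds only strings (the sample values are invented for illustration):

```python
import pandas as pd

# Hypothetical sample data with digits mixed into the text.
df = pd.DataFrame({'x': ['abc123', '4', 'C', 'a1b2']})

# Remove every run of digits from each string in the column.
df['x'] = df['x'].str.replace(r'\d+', '', regex=True)
print(df['x'].tolist())
```

Entries that consisted only of digits end up as empty strings rather than missing values.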
Upvotes: 2
Reputation: 457
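It is impossible to know for sure without a data sample, but your code implies that data contains strings, since you call isdigit on the elements.
Assuming so, there are many ways to do what you want. One of them is a conditional list comprehension. The version below keeps the digit-only entries and blanks out the rest; negating the condition does the opposite: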
import pandas as pd
s = pd.DataFrame({'x':['p','2','3','d','f','0']})
out = [ x if x.isdigit() else '' for x in s['x'] ]
# Output: ['', '2', '3', '', '', '0']
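The same idea can be turned into a working version of the function from the question. This is only a sketch: remove_digits is a hypothetical name, and the isinstance guard assumes that non-string entries (such as floats or NaN) should pass through unchanged. Note that apply receives the function itself, without parentheses.

```python
import pandas as pd

def remove_digits(value):
    # Non-string entries (e.g. floats, NaN) have no string methods; return them as-is.
    if not isinstance(value, str):
        return value
    # Rebuild the string from its non-digit characters.
    return ''.join(ch for ch in value if not ch.isdigit())

# Hypothetical sample data mixing strings and a float.
s = pd.DataFrame({'x': ['a1', 2.5, 'b22c', '7']})

# Pass the function object to apply; do not call it.
cleaned = s['x'].apply(remove_digits)
print(cleaned.tolist())
```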
Upvotes: 0