Reputation: 43
I am trying to remove certain characters from strings within certain columns in a pandas dataframe. I am doing all of this within a for loop so I would like to use an if statement within the loop to perform the actions on all 'object' dtype columns.
for col in pitchtype:
    pitchtype[col] = pitchtype[col].replace(np.nan, 0)
    if pitchtype[col].dtype == 'object':
        pitchtype[col] = pitchtype[col].map(lambda x: x.replace(' %', ''))
Is there a way to express that condition in the if statement?
edit: added my DF below. Basically the columns with % in the header have '%' symbols in the values which are preventing them from being float. I am trying to remove the '%'s and change the columns to type float afterwards.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 264 entries, 0 to 263
Data columns (total 18 columns):
Name 264 non-null object
Team 264 non-null object
FB% 264 non-null object
FBv 264 non-null float64
SL% 264 non-null object
SLv 264 non-null float64
CT% 264 non-null object
CTv 264 non-null float64
CB% 264 non-null object
CBv 264 non-null float64
CH% 264 non-null object
CHv 264 non-null float64
SF% 264 non-null object
SFv 264 non-null float64
KN% 264 non-null object
KNv 264 non-null float64
XX% 264 non-null object
playerid 264 non-null int64
dtypes: float64(7), int64(1), object(10)
memory usage: 37.2+ KB
Upvotes: 4
Views: 11980
Reputation: 164773
You can use pd.DataFrame.select_dtypes and pd.Series.str.rstrip:
for col in df.select_dtypes(['object']):
    df[col] = pd.to_numeric(df[col].str.rstrip('%'), errors='coerce')
The conversion to float is performed by pd.to_numeric. errors='coerce' gives NaN for non-convertible values.
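One thing to watch with the loop above: select_dtypes(['object']) also picks up plain text columns such as Name and Team, and errors='coerce' would turn their values into NaN. Here is a minimal sketch, assuming only the columns whose header ends in '%' should be converted. The small frame is made-up sample data, not the asker's.

```python
import numpy as np
import pandas as pd

# Made-up sample data mimicking the layout in the question.
df = pd.DataFrame({
    "Name": ["A. Pitcher", "B. Hurler"],
    "FB%": ["55.1 %", np.nan],
    "FBv": [92.3, 90.1],
})

# Assumption: only headers ending in '%' hold percentage strings.
pct_cols = [c for c in df.columns if c.endswith("%")]

for col in pct_cols:
    # Strip trailing '%' and spaces, then convert; unparseable values become NaN.
    df[col] = pd.to_numeric(df[col].str.rstrip(" %"), errors="coerce")
```

Selecting by header keeps Name and Team untouched, at the cost of relying on the naming convention described in the question.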
Upvotes: 4
Reputation: 5958
I think this may be what you're looking for: check each individual value to see whether it is a string before calling replace on it.
if pitchtype[col].dtype == object:  # comparing against the string 'object' works too
    pitchtype[col] = pitchtype[col].map(lambda x: x.replace(' %', '') if isinstance(x, str) else x)
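To see the guard in context, here is a small self-contained sketch. The sample frame and the final astype(float) step are assumptions added for illustration, following the asker's stated goal of ending up with float columns.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the real frame has 18 columns.
pitchtype = pd.DataFrame({
    "SL%": pd.Series(["12.5 %", np.nan, "8.0 %"], dtype=object),
    "SLv": [84.2, np.nan, 85.0],
})

for col in pitchtype:
    pitchtype[col] = pitchtype[col].replace(np.nan, 0)
    if pitchtype[col].dtype == object:
        # The 0 fill-ins are ints, so only call str.replace on real strings.
        pitchtype[col] = pitchtype[col].map(
            lambda x: x.replace(" %", "") if isinstance(x, str) else x
        )
        # Assumption: columns named with a trailing '%' should end up as floats.
        if col.endswith("%"):
            pitchtype[col] = pitchtype[col].astype(float)
```

Without the isinstance check, the lambda would raise on the zeros that replaced the NaN values, because integers have no two-argument replace method.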
Upvotes: 5