Reputation: 83
I have a dataframe with a column named USER_ID. Ideally USER_ID should be numeric, but the data coming from the source typically contains some bad records, which I want to discard from my final dataframe. For example, the values in the column look like this:
DF
USER_ID |
---|
23456 |
1236 |
NO_NULL |
FBA56X%^ |
and the final dataframe should be:
DF1
USER_ID |
---|
23456 |
1236 |
The code I am using to clean it is below:
DF1 = DF[np.isfinite(pd.to_numeric(DF.USER_ID,errors='coerce))]
But it seems this code is not working properly. Any suggestions would be appreciated.
Upvotes: 0
Views: 7094
Reputation: 14949
You can use str.isnumeric() to filter the numeric values:
df1 = df.loc[df.USER_ID.str.isnumeric()]
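For anyone who wants to try this, here is a minimal, self-contained sketch. It assumes the USER_ID values are stored as strings, and the sample frame just mirrors the table in the question. It also shows a to_numeric variant for comparison. Note that the line as posted in the question is missing the closing quote after coerce, which would raise a SyntaxError before any filtering happens.

```python
import pandas as pd

# Sample data mirroring the question's table (assumption: values are strings).
df = pd.DataFrame({"USER_ID": ["23456", "1236", "NO_NULL", "FBA56X%^"]})

# Option 1: keep rows whose string consists only of numeric characters.
mask_str = df["USER_ID"].str.isnumeric()
df1 = df.loc[mask_str]

# Option 2: coerce to numbers; values that cannot be parsed become NaN,
# and notna() then selects only the rows that parsed successfully.
numeric = pd.to_numeric(df["USER_ID"], errors="coerce")
df2 = df.loc[numeric.notna()]

print(df1["USER_ID"].tolist())  # ['23456', '1236']
print(df2["USER_ID"].tolist())  # ['23456', '1236']
```

The two options are not equivalent in general: str.isnumeric() rejects strings such as "-5" or "1.5", while to_numeric accepts them. If the column may contain non-string entries, str.isnumeric() returns missing values for those rows, so converting with astype(str) first may be needed.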
Upvotes: 2