Reputation: 135
I am looking to remove all rows from the df where the string contains ONLY numbers.
Here is an extract of the dataframe:
        qid                   question_stemmed                                   target  question_length  total_words
149952  1d53c9c017999b4f77e2  8430397824532987451912384179815150754023741609...  0       241              3
Is there a way I can do that?
I tried the below, but it removes every row that has numbers in the string (along with any other non-letter characters). However, I want to remove only the rows that are 'numeric ONLY'.
df['question_stemmed'] = df[df['question_stemmed'].str.contains(r'[^a-z]')]
I'd appreciate any help here.
Upvotes: 2
Views: 4123
Reputation: 402463
If the strings only ever contain plain digits 0-9 (note that isdigit also accepts some other Unicode digit characters, such as superscripts):
df = df[~df['question_stemmed'].str.isdigit()]
If we also need to catch other Unicode numeric characters, such as fractions or numerals from other scripts:
df = df[~df['question_stemmed'].str.isnumeric()]
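As a minimal sketch of the filter in action, the snippet below builds a small made-up DataFrame (only the column name question_stemmed comes from the question; the rows are hypothetical) and drops the rows whose string is made up entirely of digits. It assumes the column has no missing values.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({
    "question_stemmed": ["8430397824532987", "what is 42", "hello world", "2019"],
})

# Boolean mask: True where every character in the string is a digit.
digits_only = df["question_stemmed"].str.isdigit()

# Keep only the rows that are NOT purely digits.
filtered = df[~digits_only]

print(filtered["question_stemmed"].tolist())
# ['what is 42', 'hello world']
```

Rows that merely contain a number, such as "what is 42", are kept, because isdigit is only True when the whole string is digits.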
These pandas string methods internally call the corresponding Python str methods. See "What's the difference between str.isdigit, isnumeric and isdecimal in python?" for an explanation of how these functions work.
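To make the difference concrete, here is a small sketch using the plain Python str methods. The helper name classify and the sample strings are made up for illustration.

```python
def classify(s):
    """Return (isdecimal, isdigit, isnumeric) for a string."""
    return (s.isdecimal(), s.isdigit(), s.isnumeric())

# Plain digits, superscript two, the one-half fraction, and a mixed string.
for sample in ["123", "\u00b2", "\u00bd", "12a"]:
    print(repr(sample), classify(sample))

# '123' (True, True, True)
# '²' (False, True, True)
# '½' (False, False, True)
# '12a' (False, False, False)
```

Each method is at least as permissive as the one before it: isdecimal is the strictest, isdigit also accepts characters such as superscript digits, and isnumeric additionally accepts characters such as fractions.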
Upvotes: 5