Shalin

Reputation: 135

Remove rows from pandas dataframe if string has 'only numbers'

I am looking to remove all rows from the df where the string contains ONLY numbers.

Here is an extract of the dataframe

                         qid                                   question_stemmed  target  question_length  total_words
149952  1d53c9c017999b4f77e2  8430397824532987451912384179815150754023741609...       0              241            3

Is there a way I can do that?

I tried the code below, but it removes every row whose string contains any number (or any other non-letter character). What I want is to remove only the rows that are 'numeric ONLY'.

df['question_stemmed'] = df[df['question_stemmed'].str.contains(r'[^a-z]')]

I'd appreciate any help here.

Upvotes: 2

Views: 4123

Answers (1)

cs95

Reputation: 402463

If the strings only ever contain plain digits 0-9 (note that str.isdigit also accepts some non-ASCII digit characters, such as superscripts):

df = df[~df['question_stemmed'].str.isdigit()]

If we also need to catch other Unicode numeric characters, such as fractions or numerals from other scripts:

df = df[~df['question_stemmed'].str.isnumeric()]
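As a rough illustration of the filter above, here is a minimal sketch on a made-up three-row frame. The sample data is hypothetical; only the column names qid and question_stemmed come from the question. It assumes the column has no missing values, since the mask would then contain NaN and could not be negated directly.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "qid": ["a1", "b2", "c3"],
    "question_stemmed": [
        "8430397824532987",      # digits only: should be dropped
        "what is 42",            # mixed text and digits: should be kept
        "how do i sort a list",  # no digits: should be kept
    ],
})

# True only where every character in the string is a digit.
mask = df["question_stemmed"].str.isdigit()

# Keep the rows that are NOT digit-only.
filtered = df[~mask]
print(filtered)
```

Note that the result is assigned to a new frame rather than back into a single column, which is where the attempt in the question goes wrong.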

These pandas string methods apply the corresponding Python str methods to each element. See What's the difference between str.isdigit, isnumeric and isdecimal in python? for an explanation of how these functions work.
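To make the difference concrete, the sketch below compares the three checks on a few hand-picked strings, and adds a regex-based check for the case where strictly ASCII 0-9 is wanted. The sample strings and the ascii_only name are my own, and Series.str.fullmatch assumes pandas 1.1 or newer.

```python
import pandas as pd

s = pd.Series([
    "123",     # ASCII digits
    "\u0663",  # ARABIC-INDIC DIGIT THREE
    "\u00b2",  # SUPERSCRIPT TWO
    "\u00bd",  # VULGAR FRACTION ONE HALF
    "12a",     # contains a letter
])

# Each check is progressively more permissive.
comparison = pd.DataFrame({
    "isdecimal": s.str.isdecimal(),
    "isdigit": s.str.isdigit(),
    "isnumeric": s.str.isnumeric(),
})
print(comparison)

# Strictly ASCII 0-9: the whole string must match the character class.
ascii_only = s.str.fullmatch(r"[0-9]+")
print(ascii_only)
```

If only ASCII digits should count, something like df[~df['question_stemmed'].str.fullmatch(r'[0-9]+')] would be the stricter variant of the filter.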

Upvotes: 5
