Jan Marczak

Reputation: 25

How to check if any word in a string has special characters and conditions in Pandas

I have a dataframe in which one column contains a tweet. I want to get the rows where this "tweet" column contains any word that starts with "#" and has 2 or more capital letters.

So for example, I want to retrieve rows such as:

However, these would not meet my conditions:

Upvotes: 1

Views: 1979

Answers (1)

Corralien

Reputation: 120459

Try str.contains:

df['Match'] = df['tweet'].str.contains(r'#[A-Z][^A-Z#]*[A-Z]')
print(df)

# Output
                                       tweet  Match
0    I love coding in python. #CodingSession   True
1        I am not scared of #COVID19 anymore   True
2  I love coding in python. #Coding #Session  False
3    I love coding in python. #Codingsession  False
4       I am not scared of #Covid19 anymore.  False
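
The pattern breaks down as follows:
  • # for the literal hash sign that starts the hashtag
  • [A-Z] for a capital letter right after it
  • [^A-Z#]* for any run of characters (spaces included) that are neither capital letters nor #
  • [A-Z] for a second capital letter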
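
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The first five rows are the tweets from the output above; the sixth row and the `strict` pattern are my own additions for illustration, not part of the original data. Since `[^A-Z#]*` can also run across spaces, the original pattern may match a capital letter in a later word. The hypothetical `strict` variant excludes whitespace so both capitals have to sit inside the same hashtag. Note that it also accepts hashtags whose first character is not a capital, which may or may not be what you want.

```python
import pandas as pd

df = pd.DataFrame({
    "tweet": [
        "I love coding in python. #CodingSession",
        "I am not scared of #COVID19 anymore",
        "I love coding in python. #Coding #Session",
        "I love coding in python. #Codingsession",
        "I am not scared of #Covid19 anymore.",
        "I love #Coding in Python",  # hypothetical extra row, not in the original data
    ]
})

# Pattern from the answer: '#', a capital, anything but capitals/'#', a capital.
loose = r"#[A-Z][^A-Z#]*[A-Z]"

# Assumed stricter variant: whitespace is excluded too, so the two capitals
# must belong to the same hashtag.
strict = r"#[^\sA-Z#]*[A-Z][^\sA-Z#]*[A-Z]"

df["Match"] = df["tweet"].str.contains(loose)
df["StrictMatch"] = df["tweet"].str.contains(strict)

# Keep only the rows that satisfy the stricter condition.
matches = df[df["StrictMatch"]]
print(matches["tweet"])
```

The last row is where the two patterns should disagree: the loose pattern pairs the "C" of "#Coding" with the "P" of "Python", while the strict one stops at the space.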


Upvotes: 2
