Reputation: 25
I have a dataframe in which one column contains a tweet. I want to get the rows of this dataframe where the "tweet" column contains any word that starts with "#" and has 2 or more capital letters.
So, for example, I want to retrieve rows such as:
However, these would not qualify under my conditions:
Upvotes: 1
Views: 1979
Reputation: 120459
Try str.contains:
df['Match'] = df['tweet'].str.contains(r'#[A-Z][^A-Z#]*[A-Z]')
print(df)
# Output
                                       tweet  Match
0    I love coding in python. #CodingSession   True
1        I am not scared of #COVID19 anymore   True
2  I love coding in python. #Coding #Session  False
3    I love coding in python. #Codingsession  False
4       I am not scared of #Covid19 anymore.  False
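Since the question asks for the matching rows rather than a flag column, the same boolean result can also be used directly as a mask. A minimal sketch, assuming the sample tweets from the output above; the names `pattern`, `mask` and `matches` are illustrative only, and `na=False` is an added assumption so that missing tweets count as non-matches:

```python
import pandas as pd

# Sample data copied from the printed output above.
df = pd.DataFrame({
    "tweet": [
        "I love coding in python. #CodingSession",
        "I am not scared of #COVID19 anymore",
        "I love coding in python. #Coding #Session",
        "I love coding in python. #Codingsession",
        "I am not scared of #Covid19 anymore.",
    ]
})

# Same pattern as in the answer.
pattern = r"#[A-Z][^A-Z#]*[A-Z]"

# Boolean Series: True where the tweet contains a match.
# na=False treats missing tweets as non-matching (an assumption).
mask = df["tweet"].str.contains(pattern, na=False)

# Keep only the matching rows instead of adding a column.
matches = df[mask]
print(matches)
```

With the sample data, only the rows with index 0 and 1 are kept.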
[A-Z] for a capital letter
[^A-Z#]* for anything else except a capital letter or #
[A-Z] and again a capital letter
Upvotes: 2
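One caveat with the pattern from the answer: `[^A-Z#]*` also matches spaces, so a capital letter in a later word can complete the match, and the first character after `#` must itself be a capital letter. If that matters for your data, a variant that stays inside a single hashtag may help. This is only a sketch and not part of the original answer; it assumes a hashtag consists solely of letters, digits and underscores (`\w`), and the name `strict` and the sample tweets are made up for illustration.

```python
import pandas as pd

# Hypothetical tweets chosen to show where the two patterns differ.
tweets = pd.Series([
    "Great #Coding Session today",          # second capital is in another word
    "I love #pythonCodingSession",          # hashtag starts with a lowercase letter
    "I am not scared of #COVID19 anymore",  # two capitals inside the hashtag
    "I love coding in python. #Codingsession",  # only one capital
])

# Pattern from the answer: can cross spaces, requires a capital right after '#'.
original = r"#[A-Z][^A-Z#]*[A-Z]"

# Assumed alternative: '#', then word characters containing at least two capitals.
strict = r"#\w*[A-Z]\w*[A-Z]"

print(tweets.str.contains(original).tolist())  # [True, False, True, False]
print(tweets.str.contains(strict).tolist())    # [False, True, True, False]
```

Which behaviour is right depends on what counts as a "word" in your tweets, so it is worth checking both patterns against a few real rows.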