Ash

Reputation: 3550

Regex that returns hashtags with hash sign but excludes @ mentions

I have a regex that returns the words in a tweet. It excludes @mentions and includes hashtags, but it strips the hash sign (#) from them:

import re

# Match words of two or more characters that are not preceded by '@'
pattern = re.compile(r'(?u)(?<![@])\b\w\w+\b')
print(pattern.findall('this is a tweet #hashtag @mention'))

This returns

['this', 'is', 'tweet', 'hashtag']

What I need is a modification to this regex that keeps the hash sign on hashtags, so that it returns:

['this', 'is', 'tweet', '#hashtag']

Note that my question is different from extracting only @mentions and #hashtags. I want both regular words and hashtags, but I don't want @mentions.

Upvotes: 0

Views: 1308

Answers (1)

user7823241

Reputation: 240

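Adding '#?' just before the word boundary lets the pattern match an optional leading hash sign, so each word is returned with zero or one '#' in front of it. The lookbehind still rejects anything that directly follows an '@'.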

import re

# '#?' optionally consumes a leading hash sign before the word itself
pattern = re.compile(r'(?u)(?<![@])#?\b\w\w+\b')
results = pattern.findall('this is a tweet #hashtag @mention')
print(results)

Resulting in:

['this', 'is', 'tweet', '#hashtag']
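
If you want to reuse this, here is a minimal sketch that wraps the same pattern in a helper and tries it on a second input. The names `TOKEN_RE` and `tokens` are just illustrative and not part of the question. I have also dropped `(?u)`, on the assumption that you are on Python 3, where str patterns are Unicode-aware by default.

```python
import re

# Same pattern as above, without the (?u) flag (assumes Python 3 str input).
# (?<!@)  the match must not start right after an '@'
# #?      optional leading hash sign
# \b\w\w+\b  a whole word of at least two word characters
TOKEN_RE = re.compile(r'(?<!@)#?\b\w\w+\b')


def tokens(text):
    """Return words and #hashtags, skipping @mentions and 1-character words."""
    return TOKEN_RE.findall(text)


print(tokens('this is a tweet #hashtag @mention'))
# ['this', 'is', 'tweet', '#hashtag']
print(tokens('#one #two plain @skip'))
# ['#one', '#two', 'plain']
```

As in the original pattern, single-character words such as 'a' are dropped because `\w\w+` needs at least two characters.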

Upvotes: 2
