Reputation: 2189
message = "@me this is nice. @you: let me #help you. #anyone> #Python, #All."
message.split()
gives
['@me', 'this', 'is', 'nice.', '@you:', 'let', 'me', '#help', 'you.', '#anyone>', '#Python,', '#All.']
But what I want is
['@me', 'this', 'is', 'nice.', '@you', 'let', 'me', '#help', 'you.', '#anyone', '#Python', '#All']
without the ':', '.', '>' or any other symbol. I just want words alone.
startswith('#')
should return
['#help', '#anyone', '#Python', '#All']
and the hashtag_links
will then return
["<a href='hashtags/help'>#help</a>", "<a href='hashtags/anyone'>#anyone</a>", ...]
What I want to do is to be able to replace hashtags in message
with their equivalent in hashtag_links
so that they can be clickable when rendered in HTML.
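One way to do the replacement described here is a single re.sub pass with a callback, which avoids building the list of links separately. This is only a minimal sketch, not code from the post: the helper name linkify_hashtags, the forward-slash URL form hashtags/<tag>, and the assumption that a hashtag is '#' followed by word characters are all illustrative choices.

```python
import re

# Assumption: a hashtag is '#' followed by one or more word characters,
# so trailing symbols such as '>', ',' or '.' are left outside the match.
HASHTAG_RE = re.compile(r"#(\w+)")


def linkify_hashtags(message):
    """Wrap each hashtag in an anchor tag (hypothetical helper)."""
    def to_link(match):
        tag = match.group(1)
        # Assumed URL scheme: hashtags/<tag>
        return "<a href='hashtags/{0}'>#{0}</a>".format(tag)

    return HASHTAG_RE.sub(to_link, message)


message = "@me this is nice. @you: let me #help you. #anyone> #Python, #All."
print(linkify_hashtags(message))
```

Because the pattern matches only the word characters after '#', punctuation that follows a hashtag stays in the message but outside the link. The sketch does not HTML-escape the rest of the message, which would matter if the text comes from users.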
Upvotes: 0
Views: 223
Reputation: 1610
You can do this quite easily with list comprehensions:
import string
mylist = [i.rstrip(string.punctuation) for i in message.split()]  # strip trailing punctuation such as ':', '>', ',', '.'
hashtagged = [i for i in mylist if i.startswith("#")]  # keep only the hashtags
Upvotes: 1
Reputation: 67968
[\s.:>,]
You can split on this character class with re.split, then remove the empty strings from the result.
See demo.
http://regex101.com/r/sU3fA2/11
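The approach above could be sketched as follows. The '+' quantifier is an addition to the pattern shown in this answer, assumed here so that adjacent separators (for example ". " or ": ") count as a single split point; the variable names are illustrative.

```python
import re

message = "@me this is nice. @you: let me #help you. #anyone> #Python, #All."

# Split on runs of whitespace or the listed symbols. The '+' is an addition
# to the original pattern so neighbouring separators collapse into one.
parts = re.split(r"[\s.:>,]+", message)

# The trailing '.' leaves an empty string at the end of the result; drop it.
words = [p for p in parts if p]

# Keep only the hashtags.
hashtags = [w for w in words if w.startswith("#")]

print(words)
print(hashtags)
```

Note that this strips '.' from every word, so 'nice.' and 'you.' also lose their full stops.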
Upvotes: 1