Reputation: 25

How can I extract hashtags from string?

I need a function that receives a string and extracts the hashtags (the words prefixed with "#") from it. Here's what I've done:

def hashtag(str):
    lst = []
    for i in str.split():
        if i[0] == "#":
            lst.append(i[1:])
    return lst

My code works, but it only splits on whitespace, so hashtags written back to back stay joined. For the example string "Python is #great #Computer#Science" it returns the list ['great', 'Computer#Science'] instead of ['great', 'Computer', 'Science'].

Without using RegEx please.

Upvotes: 0

Answers (4)

Alexander Sica

Reputation: 51

When you split the string using the default separator (whitespace), you get the following result:

['Python', 'is', '#great', '#Computer#Science']

You can use replace to insert a space before each hashtag, and then split:

def hashtag(str):
    lst = []
    str = str.replace('#', ' #')
    for i in str.split():
        if i[0] == "#":
            lst.append(i[1:])
    return lst

Upvotes: 0

Lior Cohen

Reputation: 5735

1. Split by #.
2. Take all tokens except the first one.
3. Keep the first word of each token, which drops the surrounding spaces.

s = "Python is #great #Computer#Science"
out = [w.split()[0] for w in s.split('#')[1:]]
out
['great', 'Computer', 'Science']
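
One caveat with the one-liner above: w.split()[0] raises an IndexError when a "#" is followed only by whitespace or sits at the end of the string, because that token has no words. A minimal sketch of a guarded variant follows. The function name tags_by_hash_split is made up for illustration, and skipping a "#" that is followed by a space is my assumption about the desired behavior, not something the question states.

```python
def tags_by_hash_split(s):
    """Collect hashtag words by splitting on '#', without regex."""
    tags = []
    # The chunk before the first '#' is never a tag, so skip index 0.
    for chunk in s.split("#")[1:]:
        # Skip an empty chunk ("##" or a trailing '#') and a chunk that
        # starts with whitespace (a lone '#', assumed not to be a tag).
        if chunk and not chunk[0].isspace():
            # The chunk starts with a non-space character here, so
            # split() is guaranteed to return at least one word.
            tags.append(chunk.split()[0])
    return tags


print(tags_by_hash_split("Python is #great #Computer#Science"))
# ['great', 'Computer', 'Science']
print(tags_by_hash_split("trailing hash #"))
# []
```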

Upvotes: 1

Prune

Reputation: 77837

Split into words, and then filter for the ones beginning with an octothorpe (hash).

[word for word in str.replace("#", " #").split()
    if word.startswith('#')
]

The steps are:

1. Insert a space in front of each hash, to make sure we separate on them.
2. Split the string at whitespace.
3. Keep the words that start with a hash.

Result:

['#great', '#Computer', '#Science']
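
The question asks for the words without the leading "#". Assuming that is the goal, a small variation of the same comprehension slices off the first character of each kept word. The variable names text and tags are placeholders of mine, used instead of str so the built-in is not shadowed.

```python
text = "Python is #great #Computer#Science"

# Same replace-and-split idea, but word[1:] drops the leading '#'.
tags = [word[1:] for word in text.replace("#", " #").split()
        if word.startswith("#")]

print(tags)  # ['great', 'Computer', 'Science']
```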

Upvotes: 3

ThePyGuy

Reputation: 18406

You can first find the index where # first occurs, then split the rest of the string on #:

text = 'Python is #great #Computer#Science'
text[text.find('#')+1:].split('#')
Out[214]: ['great ', 'Computer', 'Science']

Finally, you can use strip to remove the surrounding whitespace:

[tag.strip() for tag in text[text.find('#')+1:].split('#')]
Out[215]: ['great', 'Computer', 'Science']
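
One edge case worth guarding, as a sketch: find returns -1 when the text has no "#", so the slice starts at index 0 and the whole text comes back as a single "tag". The wrapper name tags_after_first_hash is hypothetical.

```python
def tags_after_first_hash(text):
    """Split everything after the first '#' on '#', without regex."""
    start = text.find("#")
    # find() returns -1 when there is no '#'; without this check the
    # slice below would begin at index 0 and return the whole text.
    if start == -1:
        return []
    return [tag.strip() for tag in text[start + 1:].split("#")]


print(tags_after_first_hash("Python is #great #Computer#Science"))
# ['great', 'Computer', 'Science']
print(tags_after_first_hash("no hashtags here"))
# []
```

Note that this approach keeps any ordinary words that follow a tag: for "I had a #great day" it returns ['great day'], so it fits best when nothing but hashtags follows the first "#".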

Upvotes: 2
