Why does this regex match only once?

Question

I want to extract Chinese Weibo usernames, so I use this code:

import re

def atExtractor(sentence):
    return re.findall(r"@.*\s", sentence, re.I)

Then I run it on this sentence:

atExtractor(u"@中国联通网上营业厅 @北京地铁 北京地铁10号线，从惠新西街南口到海淀黄庄")

It returns:

[u'@中国联通网上营业厅 @北京地铁 ']

Why does the regex return only one match instead of two? The same problem happens when I try to extract hashtags:

def activityExtractor(sentence):
    return re.findall("#.*#", sentence, re.I)

activityExtractor(u"#中国联通网上营业厅# #北京地铁# 北京地铁10号线")

It returns:

[u'#中国联通网上营业厅# #北京地铁#']

Avinash Raj · Accepted Answer

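Because .* is greedy: it matches as much as possible, so the match runs from the first @ through the last whitespace character in the string. Make the quantifier lazy and stop before the next whitespace: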

re.findall(r"@.*?(?=\s)", sentence, re.I)

or

re.findall(r"@\S*", sentence, re.I)

\S* matches zero or more non-whitespace characters, so each match stops at the first whitespace after the @.
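To see the difference side by side, here is a minimal Python 3 sketch that runs the greedy pattern and both fixes on the sentences from the question. The lazy hashtag pattern #.*?# is my own extension of the same idea and is not part of the patterns above; the variable names are illustrative, and re.I is left out because none of the patterns contain letters.

```python
import re

sentence = "@中国联通网上营业厅 @北京地铁 北京地铁10号线，从惠新西街南口到海淀黄庄"
tagged = "#中国联通网上营业厅# #北京地铁# 北京地铁10号线"

# Greedy: .* runs to the last whitespace, so both mentions end up in one match.
greedy_mentions = re.findall(r"@.*\s", sentence)

# Lazy quantifier plus lookahead: stop before the first whitespace after each "@".
lazy_mentions = re.findall(r"@.*?(?=\s)", sentence)

# Negated class: "@" followed by a run of non-whitespace characters.
mentions = re.findall(r"@\S*", sentence)

# Assumed extension for hashtags: a lazy quantifier stops at the nearest closing "#".
hashtags = re.findall(r"#.*?#", tagged)

print(greedy_mentions)  # ['@中国联通网上营业厅 @北京地铁 ']
print(lazy_mentions)    # ['@中国联通网上营业厅', '@北京地铁']
print(mentions)         # ['@中国联通网上营业厅', '@北京地铁']
print(hashtags)         # ['#中国联通网上营业厅#', '#北京地铁#']
```

One difference between the two fixes: the lookahead version needs a whitespace character after the username, so it would miss a mention at the very end of the string, while @\S* would still match it.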
