JohnDoe

Reputation: 91

regular expressions emoticons

I have data split into fileids. I am trying to go through the data per fileid and search for the emoticons :( and :) as defined by the regex. If an emoticon is found, I need to record a) that an emoticon was found and b) in which fileid. When I run this script and print emoticon, I get 0. How is this possible? I am a beginner.

import re

emoticon = 0
for fileid in corpus.fileids():
    m = re.search(r'^(:\(|:\))+$', fileid)
    if m is not None:
        emoticon += 1

Upvotes: 0

Views: 1484

Answers (1)

vroomfondel

Reputation: 3106

It looks to me like your regex itself works: for a string made up only of emoticons, m should indeed not be None.

>>> re.search(r'^(:\(|:\))+$', ':)').group()
':)'
>>> re.search(r'^(:\(|:\))+$', ':(').group()
':('
>>> re.search(r'^(:\(|:\))+$', ':):(').group()
':):('
>>> re.search(r'^(:\(|:\))+$', ':)?:(').group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

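However, two things look questionable to me.

  • Because of the ^ and $ anchors, this will only match strings that consist entirely of emoticons. A string like 'hello :)' will not match.
  • Is fileid really what you're searching? In your loop, fileid is presumably the identifier of each file (typically its name), not its contents, so the regex never sees the text itself.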
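
To illustrate both points, here is a minimal sketch rather than a drop-in fix. It drops the anchors so emoticons are found anywhere in the text, and it searches the contents of each file instead of the fileid. The find_emoticons helper and the sample dict are made up for the example; the dict just stands in for your corpus.

```python
import re

# Unanchored pattern: matches ":(" or ":)" anywhere in the text.
EMOTICON_RE = re.compile(r":\(|:\)")


def find_emoticons(texts):
    """Map each fileid to the list of emoticons found in its text.

    `texts` is a hypothetical dict of fileid -> file contents.
    Fileids without any emoticon are left out of the result.
    """
    found = {}
    for fileid, text in texts.items():
        matches = EMOTICON_RE.findall(text)
        if matches:
            found[fileid] = matches
    return found


# Made-up sample data standing in for the real corpus.
sample = {
    "a.txt": "great day :) really :)",
    "b.txt": "no emoticons here",
    "c.txt": "oh no :(",
}
result = find_emoticons(sample)
print(result)
```

If your corpus object is an NLTK-style reader (an assumption on my part, based on corpus.fileids()), you could probably build the dict with something like {fileid: corpus.raw(fileid) for fileid in corpus.fileids()}.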

Upvotes: 1
