Alex

Reputation: 1537

Replace special characters from list in python

How do I replace special characters (emoticons) with a given feature name?

For example

emoticons = \
    [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,\
        ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,\
        ('__EMOT_LOVE',     ['<3', ':*', ] )   ,\
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,\
        ('__EMOT_FROWN',        [':-(', ':(', ] )   ,\
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,\
    ]

msg = 'I had a beautiful day :)'

Desired output:

>> I had a beautiful day __EMOT_SMILEY

I know how to do it with a dict, but here I have multiple values associated with each feature.

The following code will not work in this case, because emoticons is a list of tuples rather than a dict:

for emote, replacement in emoticons.items():
  msg = msg.replace(emote, replacement)

Upvotes: 0

Views: 292

Answers (7)

Eugene Yarmash

Reputation: 149736

You could use a dictionary and a regex:

import re

def replace(msg, emoticons):
    d = {r: emote for emote, replacement in emoticons for r in replacement}
    pattern = "|".join(map(re.escape, d))
    msg = re.sub(pattern, lambda match: d[match.group()], msg)
    return msg

print(replace(msg, emoticons))  # I had a beautiful day __EMOT_SMILEY
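
One caveat with a pattern built this way: Python's re module tries the alternatives from left to right, so a short emoticon such as ':(' can win over a longer one that starts the same way, such as ':(('. A minimal sketch of a longest-first variant follows, assuming data in the same shape as the question's list (trimmed here); the name replace_longest_first is hypothetical and not part of the answer above.

```python
import re

# Sample data in the same shape as the question's list (trimmed for brevity).
emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:']),
    ('__EMOT_FROWN',  [':-(', ':(']),
    ('__EMOT_CRY',    [':,(', ":'(", ':"(', ':((']),
]

def replace_longest_first(msg, emoticons):
    # Map each emoticon string to its feature name.
    lookup = {symbol: name for name, symbols in emoticons for symbol in symbols}
    # Put longer alternatives first, so ':((' is tried before ':('.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, ordered)))
    return pattern.sub(lambda match: lookup[match.group()], msg)

print(replace_longest_first('I had a beautiful day :)', emoticons))
print(replace_longest_first('What a day :((', emoticons))
```

With the unsorted pattern, the second message would likely come out as a frown followed by a stray '(' instead of a single cry feature.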

Upvotes: 2

javidcf

Reputation: 59691

Try this instead:

emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:',]),
    ('__EMOT_LAUGH',  [':-D', ':D', 'X-D', 'XD', 'xD',]),
    ('__EMOT_LOVE',   ['<3', ':*',]),
    ('__EMOT_WINK',   [';-)', ';)', ';-D', ';D', '(;', '(-;',]),
    ('__EMOT_FROWN',  [':-(', ':(',]),
    ('__EMOT_CRY',    [':,(', ':\'(', ':"(', ':((',]),
]

msg = 'I had a beautiful day :)'

for key, replaceables in dict(emoticons).items():
  for replaceable in replaceables:
    msg = msg.replace(replaceable, key)

print(msg)  # I had a beautiful day __EMOT_SMILEY

Upvotes: 0

Adam H

Reputation: 11

You can try using a dict. This should work as long as your emoticons are only 2 or 3 characters long, appear at the end of the message, and are preceded by a space. I'm sure you can make it more robust, but this will work for now.

emoticons = {
    '__EMOT_SMILEY': {':-)', ':)', '(:', '(-:'},
    '__EMOT_LAUGH': {':-D', ':D', 'X-D', 'XD', 'xD'},
    '__EMOT_LOVE': {'<3', ':*'},
    '__EMOT_WINK': {';-)', ';)', ';-D', ';D', '(;', '(-;'},
    '__EMOT_FROWN': {':-(', ':('},
    '__EMOT_CRY': {':,(', ':\'(', ':"(', ':(('},
}

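msg = 'I had a beautiful day :,('
# The emoticon is assumed to be the last 2 or 3 characters, preceded by a space.
if msg[-3] == ' ':
    img = msg[-2:]  # two-character emoticon
else:
    img = msg[-3:]  # three-character emoticon

for k, v in emoticons.items():
    if img in v:
        # Drop the emoticon and the space before it, then append the feature name.
        print(msg[:-len(img) - 1], k)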

Upvotes: 0

bendl

Reputation: 1630

There are plenty of answers giving you exactly what you asked for, but sometimes I think exactly what you asked for isn't the best solution. Like tobias_k said, the cleanest solution is to map many keys to the same value, essentially "reversing" your dictionary:

emoticons = \
    [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,\
        ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,\
        ('__EMOT_LOVE',     ['<3', ':*', ] )   ,\
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,\
        ('__EMOT_FROWN',        [':-(', ':(', ] )   ,\
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,\
    ]

emote_dict = {emote: name for name, vals in emoticons for emote in vals}

The above code reverses the dictionary, so now it can be used like this:

>>> print(emote_dict[':)'])
__EMOT_SMILEY
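
To go from a single lookup to replacing emoticons in a whole message, the reversed dictionary can be applied token by token. This is only a rough sketch, assuming emoticons are separated from other words by whitespace; the function name replace_tokens is hypothetical, and the list is trimmed for brevity.

```python
# Trimmed sample data in the same shape as the list above.
emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:']),
    ('__EMOT_FROWN',  [':-(', ':(']),
]

# Reverse mapping: emoticon string -> feature name.
emote_dict = {emote: name for name, vals in emoticons for emote in vals}

def replace_tokens(msg, emote_dict):
    # Look up each whitespace-separated token; keep it unchanged if it is not an emoticon.
    return ' '.join(emote_dict.get(token, token) for token in msg.split())

print(replace_tokens('I had a beautiful day :)', emote_dict))
```

Note that split() followed by ' '.join() collapses runs of whitespace into single spaces, which may or may not matter for your data.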

Upvotes: 0

zipa

Reputation: 27869

This oughta do it:

emoticons = [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] ),
        ('__EMOT_LAUGH',    [':-D', ':D', 'X-D', 'XD', 'xD', ] ),
        ('__EMOT_LOVE',     ['<3', ':*', ] ),
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ),
        ('__EMOT_FROWN',        [':-(', ':(', ] ),
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] )
    ]

emoticons = dict(emoticons)    
emoticons = {v: k for k in emoticons for v in emoticons[k]}

msg = 'I had a beautiful day :)'

for item in emoticons:
    if item in msg:
        msg = msg.replace(item, emoticons[item])

So, you create a dict, invert it, and replace all the emoticons that occur in the sentence.

Upvotes: 1

Ma0

Reputation: 15204

How about this:

emoticons = [('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:']),
             ('__EMOT_LAUGH',    [':-D', ':D', 'X-D', 'XD', 'xD']),
             ('__EMOT_LOVE',     ['<3', ':*']),
             ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;']),
             ('__EMOT_FROWN',    [':-(', ':(']),
             ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('])]

msg = 'I had a beautiful day :)'

# Every character that appears in any emoticon.
grabs = {ch for _, patterns in emoticons for pattern in patterns for ch in pattern}

for word in [x for x in msg.split() if all(y in grabs for y in x)]:
    for emot_code, search_patterns in emoticons:
        if word in search_patterns:
            msg = msg.replace(word, emot_code)
print(msg)  # I had a beautiful day __EMOT_SMILEY

Instead of trying to find any of the emoticons in the msg to replace them, it first searches for whitespace-separated words that might be emoticons and tries to replace only those.

That said, it does fail for cases with punctuation right after or before the emoticons; e.g., "I had a beautiful day :)."

So all in all.. "__EMOT_FROWN"
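
One way to soften the punctuation limitation mentioned above is to peel trailing sentence punctuation off each word before the lookup. This is a minimal sketch, assuming only '.', ',', '!' or '?' can follow an emoticon; the function name replace_with_trailing_punct and the punct parameter are hypothetical, and the list is trimmed for brevity.

```python
# Trimmed sample data in the same shape as the list above.
emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:']),
    ('__EMOT_FROWN',  [':-(', ':(']),
]

# Reverse mapping: emoticon string -> feature name.
lookup = {symbol: name for name, symbols in emoticons for symbol in symbols}

def replace_with_trailing_punct(msg, lookup, punct='.,!?'):
    out = []
    for token in msg.split():
        core = token.rstrip(punct)   # token without trailing punctuation
        tail = token[len(core):]     # the punctuation that was stripped
        out.append(lookup[core] + tail if core in lookup else token)
    return ' '.join(out)

print(replace_with_trailing_punct('I had a beautiful day :).', lookup))
```

Punctuation directly before an emoticon is still not handled here; that would need a similar step on the left side of each word.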

Upvotes: 0

Awaish Kumar

Reputation: 557

emoticons = [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,
    ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,
    ('__EMOT_LOVE',     ['<3', ':*', ] )   ,
    ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,
    ('__EMOT_FROWN',        [':-(', ':(', ] )  ,
    ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,
]


msg = 'I had a beautiful day :)'

for emote, replacement in emoticons:
    for symbol in replacement:
        msg = msg.replace(symbol, emote)

print(msg)

Upvotes: 0
