code to replace emoticons by "SAD" or "HAPPY" not working properly

Question

I want to replace every happy emoticon in a text file with "HAPPY" and every sad emoticon with "SAD", but the code isn't working properly. It does detect smileys (for now just :-) ), but in the example below it doesn't replace the emoticon with the text. It simply appends the text, and it does so twice, for reasons I don't understand.

dict_sad={":-(":"SAD", ":(":"SAD", ":-|":"SAD",  ";-(":"SAD", ";-<":"SAD", "|-{":"SAD"}
dict_happy={":-)":"HAPPY",":)":"HAPPY", ":o)":"HAPPY",":-}":"HAPPY",";-}":"HAPPY",":->":"HAPPY",";-)":"HAPPY"}

#THE INPUT TEXT#
a="guys beautifully done :-)" 

for i in a.split():
    for j in dict_happy.keys():
        if set(j).issubset(set(i)):
            print "HAPPY"
            continue
    for k in dict_sad.keys():
        if set(k).issubset(set(i)):
            print "SAD"
            continue
    if str(i)==i.decode('utf-8','replace'):
       print i

THE INPUT TEXT

a="guys beautifully done :-)"

OUTPUT ("HAPPY" is printed twice, and the emoticon isn't removed)

guys
-
beautifully
done
HAPPY
HAPPY
:-)

EXPECTED OUTPUT

guys
beautifully
done
HAPPY

Martijn Pieters · Accepted Answer

You are turning each word and each emoticon into a set; this means you are testing whether the emoticon's individual characters all occur in the word, not whether the word is the emoticon. You probably wanted to use exact matches instead:

for i in a.split():
    for j in dict_happy:
        if j == i:
            print "HAPPY"
            continue
    for k in dict_sad:
        if k == i:
            print "SAD"
            continue
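
To see why the subset test in the question printed HAPPY twice, here is a small sketch. It is written for Python 3 (so `print` is a function), and the helper name `subset_matches` is made up for illustration. It lists every key whose characters all occur in the token, which is the test the question's code performed:

```python
# Happy emoticons from the question (only the keys matter here).
happy_keys = [":-)", ":)", ":o)", ":-}", ";-}", ":->", ";-)"]


def subset_matches(token, keys):
    """Return the keys whose characters all occur somewhere in token."""
    token_chars = set(token)
    return [key for key in keys if set(key).issubset(token_chars)]


# Both ":-)" and ":)" consist only of characters found in ":-)",
# so the subset test fires twice for a single emoticon.
print(subset_matches(":-)", happy_keys))  # [':-)', ':)']

# An exact comparison matches only the emoticon itself.
print([key for key in happy_keys if key == ":-)"])  # [':-)']
```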

You can iterate over dictionaries directly; there is no need to call .keys(). You don't actually appear to be using the dictionary values, so you could just do:

for word in a.split():
    if word in dict_happy:
        print "HAPPY"
    if word in dict_sad:
        print "SAD"

and then perhaps use sets instead of dictionaries. This then can be reduced to:

words = set(a.split())
if dict_happy.viewkeys() & words:
    print "HAPPY"
if dict_sad.viewkeys() & words:
    print "SAD"

using the dictionary view on the keys as a set. Even so, it would be better to use actual sets:

sad_emoticons = {":-(", ":(", ":-|", ";-(", ";-<", "|-{"}
happy_emoticons = {":-)", ":)", ":o)", ":-}", ";-}", ":->", ";-)"}

words = set(a.split())
if sad_emoticons & words:
    print "SAD"
if happy_emoticons & words:
    print "HAPPY"
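
As an aside, `viewkeys()` exists only in Python 2. In Python 3, `dict.keys()` itself returns a set-like view, so the same intersection could be sketched as below. The function name `detect_moods` is hypothetical, and the dictionaries are shortened for brevity:

```python
# Abbreviated versions of the question's dictionaries.
dict_sad = {":-(": "SAD", ":(": "SAD"}
dict_happy = {":-)": "HAPPY", ":)": "HAPPY"}


def detect_moods(text):
    """Return the labels whose emoticons appear as whole words in text."""
    words = set(text.split())
    moods = []
    # In Python 3, keys() returns a view that supports set operators like &.
    if dict_happy.keys() & words:
        moods.append("HAPPY")
    if dict_sad.keys() & words:
        moods.append("SAD")
    return moods


print(detect_moods("guys beautifully done :-)"))  # ['HAPPY']
```

Like the set intersection above, this reports each label at most once per text, not once per emoticon.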

If you want the emoticon itself to disappear from the output, you'll have to handle every word and print either the label or the word:

for word in a.split():
    if word in dict_happy:
        print "HAPPY"
    elif word in dict_sad:
        print "SAD"
    else:
        print word

or better still, combine the two dictionaries and use dict.get():

emoticons = {
    ":-(": "SAD", ":(": "SAD", ":-|": "SAD", 
    ";-(": "SAD", ";-<": "SAD", "|-{": "SAD",
    ":-)": "HAPPY",":)": "HAPPY", ":o)": "HAPPY",
    ":-}": "HAPPY", ";-}": "HAPPY", ":->": "HAPPY",
    ";-)": "HAPPY"
}

for word in a.split():
    print emoticons.get(word, word)

Here I pass in the current word both as the look-up key and the default; if the current word is not an emoticon, the word itself is printed, otherwise the word SAD or HAPPY is printed instead.
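
If the goal is a rewritten string and not printed lines, the same `dict.get()` lookup can be wrapped in a function. This is a minimal sketch for Python 3. The name `replace_emoticons` is chosen here for illustration, and it assumes emoticons are separated from other words by whitespace:

```python
# Combined emoticon table, as in the answer above.
emoticons = {
    ":-(": "SAD", ":(": "SAD", ":-|": "SAD",
    ";-(": "SAD", ";-<": "SAD", "|-{": "SAD",
    ":-)": "HAPPY", ":)": "HAPPY", ":o)": "HAPPY",
    ":-}": "HAPPY", ";-}": "HAPPY", ":->": "HAPPY",
    ";-)": "HAPPY",
}


def replace_emoticons(text, table=emoticons):
    """Replace each whitespace-separated emoticon in text with its label."""
    # table.get(word, word) falls back to the word itself for non-emoticons.
    return " ".join(table.get(word, word) for word in text.split())


print(replace_emoticons("guys beautifully done :-)"))
# guys beautifully done HAPPY
```

Because the text is split and rejoined on whitespace, runs of spaces and line breaks collapse to single spaces, and an emoticon glued to a word (such as `done:-)`) is left untouched.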
