Banjo

Reputation: 1251

Replacing emojis in a text

I am trying to replace emojis with their descriptions.

Tweets$text[19]
"I ❤️ flying  . ☺️\U0001f44d"

For this task, I use the textclean package. The lexicon includes not only the emoji description (the y column) but also the byte-code representation (the x column):

hash_emojis[1:3]
              x                        y
1: <e2><86><95>            up-down arrow
2: <e2><86><99>          down-left arrow
3: <e2><86><a9> right arrow curving left

So the result looks like this:

Tweets$text[19] = replace_emoji(Tweets$text[19], emoji_dt = lexicon::hash_emojis)

Tweets$text[19]

 "I red heart <ef><b8><8f> flying . smiling face <ef><b8><8f> thumbs up "

I only want the description, without the byte-code representation, because otherwise I have to clean the text a second time. How can I apply only the "y column" to the text? Is there perhaps a better way to deal with emojis in R?

Upvotes: 2

Views: 1115

Answers (1)

phiver

Reputation: 23598

After using replace_emoji, you can use replace_non_ascii to get rid of the leftover non-ASCII byte codes:

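library(textclean)

text <- "I ❤️ flying  . ☺️\U0001f44d"
# Replace each emoji with its description; some byte codes are left behind
t <- replace_emoji(text)
# Strip the leftover non-ASCII byte codes
replace_non_ascii(t)
"I red heart flying . smiling face thumbs up"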
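For what it is worth, the leftover `<ef><b8><8f>` is the UTF-8 encoding of U+FE0F (variation selector-16), an invisible character that follows some emojis and has no entry of its own in the lexicon. If you would rather not run a second textclean pass, a minimal base R sketch that strips such byte tokens from the already-replaced string could look like the following. Note that `strip_byte_tokens` is a hypothetical helper written for this illustration, not a textclean function, and it assumes the tokens always have the two-hex-digit `<xx>` form shown in the question.

```r
# Output of replace_emoji() as shown in the question
x <- "I red heart <ef><b8><8f> flying . smiling face <ef><b8><8f> thumbs up "

# Hypothetical helper (not part of textclean): drop runs of <xx> byte tokens,
# then collapse repeated spaces and trim both ends
strip_byte_tokens <- function(s) {
  s <- gsub("(<[0-9a-f]{2}>)+", "", s)
  trimws(gsub(" +", " ", s))
}

cleaned <- strip_byte_tokens(x)
print(cleaned)
```

This would also remove any literal text that happens to look like `<ab>`, so replace_non_ascii is probably the safer choice when the tweets may contain such strings.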

Upvotes: 2
