Reputation: 953
Hi, I am trying to use unicodedata in Python 3.7 on Linux, but it fails. Any help is highly appreciated.
I searched the net for the same issue but couldn't find any hint that points me in the right direction.
My problem: I call unicodedata.name(string) and get the error TypeError: name() argument 1 must be a unicode character, not str.
Minimal working example
#!/usr/bin/env python3
import re
import emoji
import unicodedata


def replace_emoji(document):
    emoji_all = emoji.EMOJI_ALIAS_UNICODE.items()
    emoji_items = []
    emoji_pattern = re.compile(u'|'.join(
        re.escape(u[1]) for u in emoji_all), flags=re.UNICODE)
    emoji_items = re.findall(emoji_pattern, document)
    for item in emoji_items:
        unicodes = []
        unicode_values = []
        for char in range(len(item)):
            if not len(item) > 1:
                unicodes.append(r'{:x}'.format(ord(item[char])).upper())
            unicode_values.append([hex(ord(x)) for x in item[char]][0])
        char_length = len(unicode_values)
        chars = [chr(int(u, 16)) for u in unicode_values]
        if char_length == 2:
            print(chars)
            value = u'\\U{:x}\\U{:x}'.format(
                ord(chars[0]), ord(chars[1])).upper()
            unicodedata.name(value)
    return document
My test run
print(replace_emoji(u'🇯🇵🇰🇰🇷🇷🇩🇪🇨🇨🇳🇺🇺🇸🇫🇫🇷🇷🇪🇸🇸🇮🇹🇹🇷🇺🇬🇬🇧'))
Upvotes: 0
Views: 4430
Reputation: 691
I believe you can treat all emoji characters as normal characters in Python 3.
I can't test the code at the moment, but I think this should do it.
import emoji
import unicodedata


def replace_emojis(document):
    emoji_chars = emoji.EMOJI_ALIAS_UNICODE.values()

    def _emoji(char):
        if char in emoji_chars:
            return unicodedata.name(char)

    return ''.join(_emoji(char) or char for char in document)
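As for the TypeError itself: unicodedata.name() accepts a string of exactly one character. In the question, value is not a character at all but the literal text of two backslash escapes, so it is many characters long. A flag emoji is also two code points (a pair of regional indicator symbols), so even the flag itself cannot be passed in one call. Below is a minimal sketch using only the standard library that names each code point separately; codepoint_names is a hypothetical helper name of mine, not part of unicodedata or emoji.

```python
import unicodedata


def codepoint_names(text):
    """Return the Unicode name of every code point in text, in order."""
    # unicodedata.name() takes exactly one character, so iterate over the
    # string instead of passing a multi-character sequence in one call.
    # The second argument is a fallback for code points that have no name.
    return [unicodedata.name(ch, 'U+{:04X}'.format(ord(ch))) for ch in text]


# Flag of Japan: regional indicator J followed by regional indicator P.
flag = '\U0001F1EF\U0001F1F5'
names = codepoint_names(flag)
print(names)
# ['REGIONAL INDICATOR SYMBOL LETTER J', 'REGIONAL INDICATOR SYMBOL LETTER P']
```

If you need one label per flag rather than per code point, you would have to combine the two names yourself, since the Unicode character database only names individual code points.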
Upvotes: 2