Reputation: 369
I want to find emoji in Python 3. When I print my string, it shows \ud83d\ude0a.
I can find it with
re.compile(r'(\\ud83d\\ude0a)')
But when I try to use square brackets to match a range, like \ud83d[\ude00-\ude4f], I write
re.compile(r'(\\ud83d([\\ude00-\\ude4f]))')
but it matches only part of \ud83d\ude0a and leaves ude0a.
My entire code:
import re

text = '\\ud83d\\ude0a'  # renamed from str to avoid shadowing the built-in
print(text)
emoji_pattern = re.compile(r'(\\ud83d([\\ude00-\\ude4f]))')
# emoji_pattern = re.compile(r'(\\ud83d\\ude0a)')
print(emoji_pattern.sub(r'', text))
Upvotes: 0
Views: 205
Reputation: 76
The problem is in the way you use square brackets.
Square brackets match a single character out of the characters listed inside them. Therefore, [\\ude00-\\ude4f] matches only one character from that set (for example \\, u, d, 0, etc.), and not, as you intended, the sequences from \ud83d\ude00 to \ud83d\ude4f.
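To make that concrete, here is a small sketch that runs the pattern from the question against the same escaped string and looks at what each group captured. The variable names (text, broken, matched, class_char, leftover) are only for illustration.

```python
import re

# The escaped text from the question: 12 literal characters, not an actual emoji.
text = '\\ud83d\\ude0a'

# Pattern from the question. The [...] class consumes exactly one character.
broken = re.compile(r'(\\ud83d([\\ude00-\\ude4f]))')

match = broken.search(text)
matched = match.group(0)         # everything the pattern matched
class_char = match.group(2)      # the single character taken by the class
leftover = broken.sub('', text)  # what remains after the substitution

print(repr(matched))
print(repr(class_char))
print(repr(leftover))
```

The class takes only the backslash that follows \ud83d, so the substitution removes \ud83d plus that one backslash and ude0a is left over, which is the result described in the question.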
To fix this, try using (\\ud83d(\\ude[0-4][0-9a-f])). It matches the character sequence \ud83d\ude, then one character in the range 0 to 4, and then one character in the range 0 to 9 or a to f. As a result, it detects the wanted sequence, and can be inspected here.
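A quick way to check the suggested pattern is to run it on a few sample strings. This is a minimal sketch that assumes, as in the question, that the text contains the escaped form (a literal backslash followed by u and lowercase hex digits) rather than real emoji characters. The sample strings and variable names are made up for illustration.

```python
import re

# Suggested pattern: literal \ud83d\ude, then one of 0-4, then one hex digit.
fixed = re.compile(r'(\\ud83d(\\ude[0-4][0-9a-f]))')

# The string from the question is removed completely.
cleaned = fixed.sub('', '\\ud83d\\ude0a')

# Text around the escaped emoji is kept.
in_sentence = fixed.sub('', 'ok \\ud83d\\ude0a!')

# \ude50 is outside the ude00-ude4f range, so nothing is removed here.
out_of_range = fixed.sub('', 'hi \\ud83d\\ude50 there')

print(repr(cleaned))
print(repr(in_sentence))
print(repr(out_of_range))
```

Note that the classes only list lowercase a to f, so uppercase hex digits would not match unless the pattern is extended or compiled with re.IGNORECASE.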
Upvotes: 1