Arda
Arda

Reputation: 6916

Remove Emoji on MariaDB Select Query

I'm trying to remove emoji on Select query. I'm using MariaDB 10.3.14

Here's what I came up so far, but it doesn't work. It echoes all the string.

SELECT REGEXP_REPLACE("<çşığ_lorem😒ipsum😊dolor", '/([\x10000-\x10FFFF]+)/g', '')

There will be some non-English, even some HTML tags in the actual string.

What I want to achieve from this string is <çşığ_loremipsumdolor.

How do I remove emoji from string using MariaDB 10.3?

Thanks in advance,

Upvotes: 4

Views: 1292

Answers (1)

user1673579
user1673579

Reputation: 134

Your regex looks correct. I think the problem here is MariaDB itself. There exists a confirmed issue regarding the underlying cause: MDEV-11777

As an ugly workaround you could convert the text to utf8 (see this discussion for some details why this works). This will show emojis and other characters as a ?.

Afterwards you just remove the ?.

So as a complete example:

SELECT REGEXP_REPLACE(convert('<çşığ_lorem😒ipsum😊dolor' USING utf8),'\\?','');

This will give you <çşığ_loremipsumdolor.

It will of course also replace any actual question marks from the string. But I am sure this can be worked around as well (e.g. by replacing the wanted question mark by a placeholder, then removing the unwanted question marks and replacing the placeholder with a question marks again).

Unfortunately, it all sounds like something I would not do in SQL.

Upvotes: 2

Related Questions