convert utf codes to letters

Question

I'm using Ruby.

I have a value like this in the database:

25.02.2020 13:57:56: u:633;644;627;645; u:627;633;62a;627;62f; u:645;6cc;634;647; u:644;637;641; u:6a9;646;6cc;62f; u:622;645;648;632;634; u:6a9;627;631; u:628;627; u:627;6cc;646; u:628;631;646;627;645;647; u:627;632; u:635;641;631; u:62a;627; u:635;62f; u:631;648; u:628;641;631;645;627;6cc;6cc;62f; u:645;645;646;648;646; u:627;632; u:634;645;627; Сервис

Is it possible to convert it to normal Persian letters with Ruby? It should look like this:

13:57:56: سلام استاد میشه لطف کنید آموزش کار با این برنامه از صفر تا صد رو بفرمایید ممنون از شما Сервис

I would appreciate your help.

Stefan · Accepted Answer

The Unicode values in your string have a specific pattern:

u:___;___;___; where each ___ is a hexadecimal value representing a codepoint.
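
As a small illustration of what "hexadecimal value representing a codepoint" means, here is a sketch that decodes a single value taken from the string above. The variable names are only for illustration.

```ruby
# "633" is a hexadecimal string; String#hex parses it into an Integer.
codepoint = '633'.hex                     # 1587, i.e. 0x633

# Integer#chr with an explicit encoding turns the codepoint into a character.
letter = codepoint.chr(Encoding::UTF_8)   # U+0633, the first letter of the message

puts letter
```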

You can match that pattern using a regexp, and replace the encoded values with their respective Unicode chars via gsub:

str = '25.02.2020 13:57:56:
u:633;644;627;645; u:627;633;62a;627;62f; u:645;6cc;634;647; u:644;637;641; u:6a9;646;6cc;62f; u:622;645;648;632;634; u:6a9;627;631; u:628;627; u:627;6cc;646; u:628;631;646;627;645;647; u:627;632; u:635;641;631; u:62a;627; u:635;62f; u:631;648; u:628;641;631;645;627;6cc;6cc;62f; u:645;645;646;648;646; u:627;632; u:634;645;627;


Сервис'

str.gsub(/u:((?:\h+;)+)/) { Regexp.last_match(1).split(';').map(&:hex).pack('U*') }
#=> "25.02.2020 13:57:56:\nسلام استاد میشه لطف کنید آموزش کار با این برنامه از صفر تا صد رو بفرمایید ممنون از شما\n\n\nСервис"

Step by step, for each match:

the regexp matches "u:633;644;627;645;"
Regexp.last_match(1) returns the 1st capture group "633;644;627;645;"
split(';') turns that into ["633", "644", "627", "645"]
map(&:hex) converts the elements to [1587, 1604, 1575, 1605]
pack('U*') interprets them as Unicode codepoints and returns "سلام"
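
The same steps can be tried out one at a time on a single match. This is only a sketch with illustrative variable names; it uses String#[] with a capture index in place of Regexp.last_match(1), which should give the same capture group here.

```ruby
match = 'u:633;644;627;645;'

# Pull out the first capture group (everything after "u:").
captured = match[/u:((?:\h+;)+)/, 1]    # "633;644;627;645;"

# Split on ";" (the trailing empty piece is dropped by split).
parts = captured.split(';')             # ["633", "644", "627", "645"]

# Parse each hexadecimal string into an Integer codepoint.
codepoints = parts.map(&:hex)           # [1587, 1604, 1575, 1605]

# Pack the codepoints into a UTF-8 string.
word = codepoints.pack('U*')

puts word
```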
