How to search for different character sets in PostgreSQL?

Question

I want to search a table in a PostgreSQL database that contains both Arabic and English text. For example:

id | content
-----------------
1  | دجاج    
2  | chicken
3  | دجاج chicken

The query should return only row 3, the one row that contains both Arabic and English text.

I imagine this has to do with limiting characters using a regex, but I cannot find a clean way to select rows that contain both. I tried:

SELECT regexp_matches(content, '^([x00-\xFF]+[a-zA-Z][x00-\xFF]+)*')
FROM mg.messages;

However, this matches only English and some non-English characters within {}.

dwurf · Accepted Answer

I know nothing about Arabic text or RTL languages in general, but this worked:

create table phrase (
  id serial,
  phrase text
);

insert into phrase (phrase) values ('apple pie');
insert into phrase (phrase) values ('فطيرة التفاح');

select *
from phrase
where phrase like ('apple%')
or phrase like ('فطيرة%');

http://sqlfiddle.com/#!15/75b29/2
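
The query above matches known prefixes rather than scripts. To return only rows that mix both scripts, as the question asks, one option is to test for each script separately with the regex match operator. This is a minimal sketch, not a definitive solution. It assumes a UTF8 database, the mg.messages table with the id and content columns from the question, and that "Arabic" means the basic Arabic Unicode block (U+0600 to U+06FF) and "English" means the unaccented Latin letters A to Z.

```sql
-- Assumption: UTF8 server encoding, so \uXXXX escapes in the pattern refer to Unicode code points.
-- Assumption: table and column names are taken from the question (mg.messages, id, content).
SELECT id, content
FROM mg.messages
WHERE content ~ '[A-Za-z]'           -- at least one basic Latin letter
  AND content ~ '[\u0600-\u06FF]';   -- at least one character in the Arabic block
```

On the sample data in the question, only row 3 satisfies both conditions. Text that uses Arabic characters outside that block (for example the Arabic Supplement or presentation forms) would need a wider range.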
