Reputation: 1477
I have following table in PostgreSQL 11.0
name identifier name_matched
anti-angin® formula 3792673 anti angin
anti-angin® formula 3792673 anti angin
arava® 20 mg tablets D25155 arava 20 mg
nifedipine extended release tablets 60 mg 23766 nifedipine
I using following query to match name and name_matched values.
select * from tbl
where name ilike '%' || name_matched || '%'
Above query only give row4 as an output. How can I modify the query to fetch all four rows (ignoring special characters and then match the strings)
Any help is highly appreciated
Upvotes: 0
Views: 618
Reputation: 15614
select * from tbl
where
array(select x[1] from regexp_matches(name_matched, '([a-zA-Z0-9]+)', 'g') as x) <@
array(select x[1] from regexp_matches(name, '([a-zA-Z0-9]+)', 'g') as x);
or to make it shorter:
create function regexp_matches_array(astr text, apattern text)
returns text[]
language sql
immutable
strict
as $func$
select array_agg(x[1]) from regexp_matches(astr, apattern, 'g') as x
$func$
select * from tbl
where
regexp_matches_array(name_matched, '([a-zA-Z0-9]+)') <@
regexp_matches_array(name, '([a-zA-Z0-9]+)');
Note that it does not taking in account the words order.
Upvotes: 1
Reputation: 12494
This should work for the value combinations shown in your sample data. Given that you need to work around -
characters, there may be other cases you need to handle. If any turn up, comment, and I will see if I can help you work through them.
select *
from tbl
where replace(regexp_replace(name, '[^A-Za-z0-9 -]', ''), '-', ' ') ~*
replace(regexp_replace(name_matched, '[^A-Za-z0-9 -]', ''), '-', ' ')
;
Upvotes: 0