user1036204
user1036204

Reputation: 175

match a string from a list of values and create new row

This is for all SQL developers over here, this would be easy but thought to post. I have a table which has all country names in world. I have another table which has a sentence which would have more than one country name in it. How to extract the country names and place them in a separate row like below

Country_Universe_Table
America
India
Bhutan
Nepal
Iran

dataid             Comments
1            America and India has signed a deal
2            UK, Nepal and Bhutan see cold weather

The output shall come as
dataid             Country
1             America
1             India
2             UK
2             Nepal
2             Bhutan

Upvotes: 2

Views: 192

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659287

Short:
Convert your string to an array and use the array contains operator @>:

SELECT c.dataid, cut.country
FROM   comments c
JOIN   country_universe_table cut
       ON string_to_array(c.comments, ' ') @> ARRAY[cut.country];

Your column should really be named comment (singular).

Faster:

SELECT c.dataid, cut.country
FROM   comments c
     , unnest(string_to_array(c.comments, ' ')) u(country)
JOIN   country_universe_table cut USING (country);

Related:

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271151

The following is an imperfect method of doing what you want:

select c.dataid, cut.country
from Country_Universe_Table cut join
     comments c
     on c.comment like '%' || cut.country || '%'

This just looks for the country string in the comments table. However, it will match "American" to "America" and "UKELELE" to "UK". This may be close enough for what you need.

EDIT:

You can enhance this (in more recent versions of Postgres) by splitting the comment into words and then doing the comparison using a lateral join:

select c.dataid, cut.country
from comments c cross join lateral
     regexp_split_to_table(c.comment, E'[.,-() ]+') as w(word) join
     Country_Universe_Table cut
     on w.word = cut.country;

Upvotes: 3

Related Questions