Peter R
Peter R

Reputation: 3516

Count matching words between two strings in Postgres?

I want to sort results based on the number of matching words in two strings (case insensitive). Is there a way given two strings to count the number of matching words that appear in both strings?

Example:

"Red blue black green", "Green Black Blue"
-> 3
"Blue Green", "green blue"
-> 2
"Blue Green", "Red blue"
-> 1
"green blue black", "orange purple"
-> 0

I want to use this in an Order By clause. Given that a table has a column with a string in it, I'll query that table, then order the results based on the rows with the most matching words in the passed in string.

Upvotes: 1

Views: 858

Answers (1)

GMB
GMB

Reputation: 222582

You can split both phrases to words with regexp_split_to_table(), and then count the matches. A lateral join comes handy for this.

select t.*, x.cnt_matches
from (values 
    ('Red blue black green', 'Green Black Blue'),
    ('Blue Green',           'green blue'),
    ('Blue Green',           'red blue'),
    ('green blue black',     'orange purple')
) as t(str1, str2)
cross join lateral (
    select count(*) cnt_matches
    from regexp_split_to_table(lower(t.str1), ' ') w1(word)
    inner join regexp_split_to_table(lower(t.str2), ' ') w2(word)
        on w1.word = w2.word
) x

Demo on DB Fiddle:

str1                 | str2             | cnt_matches
:------------------- | :--------------- | ----------:
Red blue black green | Green Black Blue |           3
Blue Green           | green blue       |           2
Blue Green           | red blue         |           1
green blue black     | orange purple    |           0

Upvotes: 1

Related Questions