maRtin
maRtin

Reputation: 6516

LIKE clause on column and group

I have a postgresql database and a

table "phrases"

with a collection of words/ word groups separated by a "," in each row:

id | collection        (text)   | 
---+----------------------------+
 1 | apple tree, large orange   | 
 2 | kiwi, apple tree           | 
 3 | large orange, melon        | 
 4 | large orange, banana beach | 

and another

table "pattern"

with words (some might occur multiple times):

id | words   (var char)   | 
---+----------------------+
 1 | apple tree           | 
 2 | large orange         | 
 3 | large orange         | 

What I want is to have a new table containing the count of occurrences per distinct words matching with the collection column.

So this:

| words        | count |
+--------------+-------+
| apple tree   | 2     |
| large orange | 3     |

This is how far I have gotten, but I am stuck:

    SELECT phrases.collection
    FROM phrases
    WHERE phrases.collection LIKE any 
                   ( SELECT 
                     ( SELECT ARRAY(
                            SELECT DISTINCT(CONCAT('%'::text , pattern.words::text , '%'::text))
                            FROM pattern)
                        )
                    )

Even this doesn't seem to do work - and I am completely missing the group by count statement and clueless how to achieve it. Any help is highly appreciated.

Upvotes: 0

Views: 69

Answers (3)

Haleemur Ali
Haleemur Ali

Reputation: 28313

SELECT
words, COUNT(*)
FROM (SELECT distinct words FROM pattern) w
LEFT JOIN phrases ON collection LIKE '%' || words || '%'
GROUP BY 1

If the type is text, you can concat '%' around the pattern to check if exists in the phrase.

The downside is that %apple tree% would also match on pineapple tree

Upvotes: 0

Akina
Akina

Reputation: 42728

SELECT DISTINCT words, COUNT(DISTINCT phrases.id) cnt
FROM phrases
JOIN pattern ON POSITION(words IN collection) > 0
GROUP BY words;

fiddle

Upvotes: 0

Jeremy
Jeremy

Reputation: 6723

You can use unnest(string_to_array(... to expand the phrases into separate rows like so:

SELECT words, count(*)
FROM (select distinct words FROM pattern) AS sub1
JOIN (select id, unnest(string_to_array(collection, ', ')) as p  FROM phrases) AS sub2
on words = p
GROUP BY words
;

Here's a fiddle

Upvotes: 2

Related Questions