Reputation: 6516
I have a postgresql
database and a
table "phrases"
with a collection of words/ word groups separated by a "," in each row:
id | collection (text) |
---+----------------------------+
1 | apple tree, large orange |
2 | kiwi, apple tree |
3 | large orange, melon |
4 | large orange, banana beach |
and another
table "pattern"
with words (some might occur multiple times):
id | words (var char) |
---+----------------------+
1 | apple tree |
2 | large orange |
3 | large orange |
What I want is to have a new table containing the count of occurrences per distinct words matching with the collection column.
So this:
| words | count |
+--------------+-------+
| apple tree | 2 |
| large orange | 3 |
This is how far I have gotten, but I am stuck:
SELECT phrases.collection
FROM phrases
WHERE phrases.collection LIKE any
( SELECT
( SELECT ARRAY(
SELECT DISTINCT(CONCAT('%'::text , pattern.words::text , '%'::text))
FROM pattern)
)
)
Even this doesn't seem to do work - and I am completely missing the group by count statement and clueless how to achieve it. Any help is highly appreciated.
Upvotes: 0
Views: 69
Reputation: 28313
SELECT
words, COUNT(*)
FROM (SELECT distinct words FROM pattern) w
LEFT JOIN phrases ON collection LIKE '%' || words || '%'
GROUP BY 1
If the type is text, you can concat '%'
around the pattern to check if exists in the phrase.
The downside is that %apple tree%
would also match on pineapple tree
Upvotes: 0
Reputation: 42728
SELECT DISTINCT words, COUNT(DISTINCT phrases.id) cnt
FROM phrases
JOIN pattern ON POSITION(words IN collection) > 0
GROUP BY words;
Upvotes: 0
Reputation: 6723
You can use unnest(string_to_array(... to expand the phrases into separate rows like so:
SELECT words, count(*)
FROM (select distinct words FROM pattern) AS sub1
JOIN (select id, unnest(string_to_array(collection, ', ')) as p FROM phrases) AS sub2
on words = p
GROUP BY words
;
Here's a fiddle
Upvotes: 2