Paul
Paul

Reputation: 45

Counting the Number of Occurrences of a Multi-Word Phrase in Text with PostgreSQL

I have a problem, I need to count the frequency of a word phrase appearing within a text field in a PostgreSQL database.

I'm aware of functions such as to_tsquery() and I'm using it to check if a phrase exists within the text using to_tsquery('simple', 'sample text'), however, I'm unsure of how to count these occurrences accurately.

Upvotes: 0

Views: 88

Answers (1)

msanzal
msanzal

Reputation: 119

If the words are contained just once in the string (I am supposing here that your table contains two columns, one with an id and another with a text column called my_text):

SELECT
     count(id)
FROM
     my_table
WHERE
     my_text ~* 'the_words_i_am_looking_for'

If the occurrences are more than one per field, this nested query can be used:

SELECT
     id,
     count(matches) as matches
FROM (
     SELECT
          id,
          regexp_matches(my_text, 'the_words_i_am_looking_for', 'g') as matches
     FROM
          my_table
) t
GROUP BY 1

The syntax of this function and much more about string pattern matching can be found here.

Upvotes: 1

Related Questions