Reputation: 1014
I have a table with a string column. I want to remove the stop words. I used this query which seems Ok.
SELECT to_tsvector('english',colName)from tblName order by colName asc;
it does not update the column in table
I want to see the stop words of Postgresql and what the query found.Then in case I can replace it with my own file. I also checked this address and could not find the stop words list file. Actually, the address does not exist.
$SHAREDIR/tsearch_data/english.stop
Upvotes: 1
Views: 4972
Reputation: 1
The chosen answer did not match my requirement, but I found a solution for this:
SELECT regexp_replace('Bill and Susi, hand over or die!', '[^ ]*$','');
regexp_replace
-----------------------------
Bill and Susi, hand over or
(1 row)
Upvotes: 0
Reputation: 246848
There is no function to do that.
You could use something like this (in this example in German):
SELECT array_to_string(tsvector_to_array(to_tsvector('Hallo, Bill und Susi!')), ' ');
array_to_string
-----------------
bill hallo susi
(1 row)
This removes stop words, but also stems and non-words, and it does not care about word order, so I doubt that the result will make you happy.
If that doesn't fit the bill, you can use regexp_replace
like this:
SELECT regexp_replace('Bill and Susi, hand over or die!', '\y(and|or|if)\y', '', 'g');
regexp_replace
-----------------------------
Bill Susi, hand over die!
(1 row)
But that requires that you include your list of stop words in the query string. An improved version would store the stop words in a table.
Upvotes: 3