Reputation: 149
I have a table with a column field
that has values like Samsung Phone.
My question is how can I get this row if I search for a string "Samsung" or "phone". I do not want any result if I give just "Sam" or "ph" as a search term.
I had tried with ILIKE operator but if I use:
select *
from mytable
where title ILIKE 'Samsung';
It does not give any results as it considers title should be exactly equal. Also if I use title ILIKE '%Samsung%'
then it will work for any partial strings.
In short, I want if the input search term exists as a whole word in my column then only give the search result.
Also, if I have multiple search term values, then any word match should be the result
Upvotes: 10
Views: 11779
Reputation: 626689
The following solutions are tested in PostgreSQL 9.6.
You may use the \y
word boundary and a grouping construct with alternation operator separating all possible alternatives:
where title ~* '\y(?:Samsung|phone)\y'
Or, case insensitively:
where title ~* '\y(?:Samsung|phone)\y'
See the PostgreSQL demo.
Note that \y
is not the right approach when the first or last char are not word chars. E.g. you want to search for #samsung
or phone#
. Then, consider using unambiguous word boundaries:
where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)'
Here, the (?<!\w)
negative lookbehind fails the match if there is a non-alnum and no _
char immediately to the left of the current position, and the (?!\w)
is a negative lookahead fails the match if there is a non-alnum and no _
char immediately to the right of the current position. Equivalent for testing if there is a match in where
clause is:
where title ~* '(\W|^)(?:#samsung|phone#)(\W|$)'
See another PostgreSQL demo online:
CREATE TABLE mmytable
(title character varying)
;
INSERT INTO mmytable
(title)
VALUES
('#Samsung Co.'),
('They have a phone#'),
('Uncle Sam phoned him')
;
select * from mmytable where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)';
Also, consider whitespace boundaries when you need to match only in between whitespace characters or start/end of string:
where title ~* '(?<!\S)(?:#samsung|phone#)(?!\S)'
where title ~* '(\s|^)(?:#samsung|phone#)(\s|$)'
Or, in case you do not know if there can be a special char at the start or end of the string:
where title ~* '(?!\Y\w)(?:#samsung|phone#)(?<!\w\Y)'
where (?!\Y\w)
and (?<!\w\Y)
are adaptive dynamic word boundaries.
Upvotes: 2
Reputation: 170
Also you can use this:
title ~* '(^|[^\w])samsung([^\w]|$)'
The advantage of the above one is, it can be easily extended to include characters from different encoding like this: (Turkish characters)
title ~* '(^|[^\wğüşıöçĞÜŞİÖÇ])samsung([^\wğüşıöçĞÜŞİÖÇ]|$)
Here is a sample case where you may need the above extension.
For example; in a Latin5 encoded database you have a value of 'İsamsung'. İ is the capital of i in Turkish.
In this case the title ~* '(\msamsung\M)'
does not work.
This criteria returns the İsamsung value. Because in Latin5 encoding, postgre thinks İ is out of alpanumeric range and the value is something similiar to :samsung.
Upvotes: 3
Reputation:
You can use a regular expression for this:
where title ~* '(\mphone\M)|(\msamsung\M)'
The above only returns values where phone
or samsung
are complete words. The regex modifiers\m
and \M
make the pattern only match for whole words.
The regex operator ~*
makes this case-insensitive. The above expression would return Samsung Phone
or Google Phone
but not Sam's House
.
If you want to add more words, just add them using the "or" operator |
where title ~* '(\mphone\M)|(\msamsung\M)|(\mbhat\M)'
Note that this kind of search is not going t1o be super-fast. Regular expressions are expensive they cannot make use of any index.
Upvotes: 16