Reputation: 2509
I needed to use Oracle 11g's Contains() function to search some exact text contained in some field typed by the user. I was asked not to use the 'like' operator.
According to the Oracle documentation, for everything to work you need to:
}
characters{}
This works in most cases except for a few ones. Below it a test case:
create table theme
(name varchar2(300 char) not null);
insert into theme (name)
values ('a');
insert into theme (name)
values ('b');
insert into theme (name)
values ('a or b');
insert into theme (name)
values ('Pdz344_1_b');
create index name_index on theme(name) indextype is ctxsys.context;
If the 'or'
operator was interpreted, I would get all four results, which is hopefully not the case. Now if I run the following, I would expect is to only find 'a or b'
.
select * from theme
where contains(name, '{a or b}')>0;
However I also get 'Pdz344_1_b'
. But there's no 'a'
, 'o'
not 'r'
and I find it very surprising that this text is matched. Is there something I don't get about contains()'s syntax?
Upvotes: 2
Views: 2359
Reputation: 1804
CONTAINS
is not like LIKE
operator at all. Since it using ORACLE TEXT
search engine (something like google search), not just string matching.
{}
- is an escape marker. Means everything you put inside should be treated as text to escape.
Therefore you issue query to find text that looks like a or b
not like a
or b
.
So your query get matched against Pdz344_1_b
because it has b
char in it.
Row with only a
character ain't matched because a
character exists in the default stop list.
Why just b
ain't matched? Because your match sequence actually looks like a\ or\ b
.
So we have 3 tokens a
_or
_b
(underscores represents spaces). a
in stop list, and we have no string _b
in the b
row, because there only single character. But we do have this combination in the Pdz344_1_b
row, because non-alphabetic characters are treated as whitespace. If you remove {}
or query for {b or a}
then you'll get matches against b
as well.
Upvotes: 4