wormsparty
wormsparty

Reputation: 2509

Space characters inside Oracle's Contains() function

I needed to use Oracle 11g's Contains() function to search some exact text contained in some field typed by the user. I was asked not to use the 'like' operator.

According to the Oracle documentation, for everything to work you need to:

This works in most cases except for a few ones. Below it a test case:

create table theme
  (name varchar2(300 char) not null);

insert into theme (name)
values ('a');

insert into theme (name)
values ('b');

insert into theme (name)
values ('a or b');

insert into theme (name)
values ('Pdz344_1_b');

create index name_index on theme(name) indextype is ctxsys.context;

If the 'or' operator was interpreted, I would get all four results, which is hopefully not the case. Now if I run the following, I would expect is to only find 'a or b'.

select * from theme
where contains(name, '{a or b}')>0;

However I also get 'Pdz344_1_b'. But there's no 'a', 'o' not 'r' and I find it very surprising that this text is matched. Is there something I don't get about contains()'s syntax?

Upvotes: 2

Views: 2359

Answers (1)

Nagh
Nagh

Reputation: 1804

CONTAINS is not like LIKE operator at all. Since it using ORACLE TEXT search engine (something like google search), not just string matching.

{} - is an escape marker. Means everything you put inside should be treated as text to escape.

Therefore you issue query to find text that looks like a or b not like a or b. So your query get matched against Pdz344_1_b because it has b char in it.

Row with only a character ain't matched because a character exists in the default stop list.

Why just b ain't matched? Because your match sequence actually looks like a\ or\ b. So we have 3 tokens a _or _b (underscores represents spaces). a in stop list, and we have no string _b in the b row, because there only single character. But we do have this combination in the Pdz344_1_b row, because non-alphabetic characters are treated as whitespace. If you remove {} or query for {b or a} then you'll get matches against b as well.

Upvotes: 4

Related Questions