Oracle Text Contains and Numbers with Separators

Question

I'm using a ctxsys.context index on one column to enable Oracle Text full-text search, but I'm having a problem indexing numeric values that contain ',' or '.' as separators.

I created the index like this:

create index my_index on my_table(doc)
indextype is ctxsys.context parameters ('SYNC (ON COMMIT)');

Then I insert four text documents:

insert into my_table (id, doc) values (1, 'FOO 300 BAR');
insert into my_table (id, doc) values (2, 'FOO 300 BAR 1,000.00');
insert into my_table (id, doc) values (3, 'FOO1FOO');
insert into my_table (id, doc) values (4, '1 FOO');

Now I would like to use the contains operator to search for 'FOO 300 BAR', '1,000.00' and a combination of both:

select score(1), id from my_table where contains(doc, 'FOO 300 BAR', 1) > 0;
select score(1), id from my_table where contains(doc, '1,000.00', 1) > 0;
select score(1), id from my_table where contains(doc, 'FOO 300 BAR 1,000.00', 1) > 0;

The first query works as expected and returns ids 1 and 2. However, when I search for '1,000.00' I get 0 rows.
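One thing that may be worth ruling out here: in the CONTAINS query language the comma is the ACCUM operator, so an unescaped '1,000.00' can be parsed as a query expression rather than as a single term. The sketch below repeats the second search with the number escaped. It is only an assumption that query parsing contributes to the empty result; how the number was tokenized at index time is a separate matter.

```sql
-- Braces escape everything between them, so the comma is not read as ACCUM.
select score(1), id from my_table where contains(doc, '{1,000.00}', 1) > 0;

-- Alternative: escape only the comma with a backslash.
select score(1), id from my_table where contains(doc, '1\,000.00', 1) > 0;
```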

According to the documentation, the index uses BASIC_LEXER by default. I also tried setting the separators explicitly on a lexer preference and applying it to the index:

begin
  ctx_ddl.create_preference('my_lex', 'BASIC_LEXER');
  ctx_ddl.set_attribute('my_lex', 'numjoin', '.');
  ctx_ddl.set_attribute('my_lex', 'numgroup', ',');
end;
/

create index my_index on my_table(doc)
indextype is ctxsys.context parameters ('SYNC (ON COMMIT) LEXER my_lex');

But I experienced the same behavior as before.
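To see how the lexer actually split the numbers, the tokens stored for the index can be inspected directly. A minimal sketch, assuming the index is named my_index as above, so that its token table is DR$MY_INDEX$I in the index owner's schema:

```sql
-- List the tokens Oracle Text stored for my_index.
-- If '1,000.00' was indexed as one word, it appears here as a single token_text;
-- otherwise the pieces (for example 1, 000, 00) show up as separate rows.
select token_text, token_type, token_count
from dr$my_index$i
order by token_text;
```

Comparing this output before and after applying the my_lex preference would show whether the numjoin and numgroup settings changed the tokenization at all.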

Could someone explain how Oracle Text treats numbers with separators and how I could configure the index so that separated numbers are treated as single words?

I'm using Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production

