أحمد صوالحة

Reputation: 323

Oracle Text: how to remove punctuation

I am using Oracle Text to index URLs for Arabic web sites. I noticed that indexing does not ignore punctuation, since Arabic is not supported and has no thesaurus. When I search for a sentence without its punctuation, the score is low and the results are bad. When I search with the punctuation, it sometimes returns this error:

ORA-20000: Oracle Text error

DRG-50962: Query operators are not allowed in transform input string

I know what this error means: some sentences contain Oracle Text operators. How can I make the search ignore them (that is, send the sentence as is), or what is the list of operators I should remove? Note: I am using query rewrite together with an escape sequence:

    select /*+ FIRST_ROWS(1) */ id, score(1) as sc1, isn, sentence_length, URL
      from plag_web_temp_docsentences
     where contains(URL, '<query>
       <textquery>' || OriginalSentence || '
         <progression>
           <seq><rewrite>transform((TOKENS,  "{", "}", "{ }"))</rewrite></seq>
         </progression>
       </textquery>
       <score datatype="INTEGER" algorithm="COUNT"/>
     </query>', 1) > 0

Upvotes: 1

Views: 456

Answers (1)

أحمد صوالحة

Reputation: 323

OK, it seems that query rewriting does not allow escape sequences (I have no reference for this; it is just my experience), so I applied the escape sequence directly, without the query template. My query now looks like this, and it worked:

    select /*+ FIRST_ROWS(1) */ id, score(1) as sc1, isn, sentence_length, URL
      from plag_web_temp_docsentences
     where contains(URL, '{' || OriginalSentence || '}', 1) > 0;

If anyone has another solution, please suggest it.
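One case the brace approach may still miss is a sentence that itself contains a `}`, because that character would close the escape sequence early. As far as I understand the Oracle Text escape rules, a closing brace inside an escaped expression is written as `}}`. The following is a minimal sketch of the same query with that doubling added; it assumes `OriginalSentence` is the same variable as above and has not been tested against this index:

```sql
-- Sketch only: same query as above, but every } inside the sentence is
-- doubled so that it cannot end the {...} escape sequence early.
-- OriginalSentence is the variable holding the sentence being searched for.
select /*+ FIRST_ROWS(1) */ id, score(1) as sc1, isn, sentence_length, URL
  from plag_web_temp_docsentences
 where contains(URL,
                '{' || replace(OriginalSentence, '}', '}}') || '}',
                1) > 0;
```

Everything else in the sentence, including operator characters such as `&`, `|`, `-` or `?`, should then be treated as plain text because it sits inside the braces.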

Upvotes: 1
