djailer
djailer

Reputation: 351

Sybase Multiple Substrings Search

I need to get data back from a text field. The input is not all going to be pretty...some of the users don't spell well or consistently. I need to look for a variety of misspellings as well as alternative terms.

I am working with Sybase ASE and am wondering if the AND statement is getting unwieldy and may not be optimal? Here is one attempt:

AND (entry_txt like 'fight' OR
    entry_txt like 'confron%' OR
    entry_txt like 'aggres%' OR
    entry_txt like 'grab' OR
    entry_txt like 'push' OR
    entry_txt like 'strike' OR
    entry_txt like 'hit' OR
    entry_txt like 'assa%')

It will get longer as I add some new requirements for additional terms as well as some proprietary names and 8-9 more variations therein! Is there a more efficient way to do this or is that it?

I have also read that LIKE should be used for partial string comparison and IN for values from a set. How about values from a set of partial strings? Could I /should I use IN here and does that help performance?

I am searching thousands of docs so there is a lot of data to have to go through.

Upvotes: 1

Views: 5472

Answers (2)

abe_crabbers
abe_crabbers

Reputation: 201

You can actually put "like" expressions in an expression - another column in a table, or a variable.

So you could create a table with one varchar column called "like_expr" or something like that.

Then put all the above expressions into it, including the ones without % in, because they'll just degenerate to an equality operation.

In terms of efficiency, if entry_txt is indexed then the index can be used. I would think Sybase would find it easier to join to the like_expr table than to do lots and lots of ORs, but both should use the index - that should be a separate issue.)

create table abe (a varchar(20))

insert abe values ('hello')

create table abe2 (l varchar(20))

insert abe2 values ('h%')

select * from abe a where exists (select 1 from abe2 where a.a like l)

a

hello

Upvotes: 0

aF.
aF.

Reputation: 66747

Yes, for the ones that you don't have % you can use IN, for the others you still need to use OR.

It would look something like this:

AND (entry_txt in ('fight', 'grab', 'push', 'strike', 'hit')
     OR entry_txt like 'confron%'
     OR entry_txt like 'aggres%'
     OR entry_txt like 'assa%')

Upvotes: 2

Related Questions