Reputation: 21
I need to extract information from a text field which can contain one of many values. The SQL looks like:
SELECT fieldname
FROM table
WHERE bigtextfield LIKE '%val1%'
OR bigtextfield LIKE '%val2%'
OR bigtextfield LIKE '%val3%'
.
.
.
OR bigtextfield LIKE '%valn%'
My question is: how efficient is this when the number of values approaches the hundreds, and possibly thousands? Is there a better way to do this?
One solution would be to create a new table/column with just the values I'm after and doing the following:
SELECT fieldname
FROM othertable
WHERE value IN ('val1', 'val2', 'val3', ... 'valn')
Which I imagine is a lot more efficient as it only has to do exact string matching. The problem with this is that it will be a lot of work keeping this table up to date.
btw I'm using MS SQL Server 2005.
Upvotes: 2
Views: 4073
Reputation: 96552
The best solution for this is to redesign and get rid of that field that is storing multiple values and make it a related table instead. This violates one of the first rules of database design.
You should not be storing multiple values in one field and dead slow queries like this are the reason why. If you can't do that then full-test indexing is your only hope.
Upvotes: 1
Reputation: 20456
This functionality is already present in most SQL engines, including MS SQL Server 2005. It's called full-text indexing; here are some resources:
Upvotes: 5
Reputation: 754230
I don't think the main problem is the number of criteria values - but the sheer fact that a WHERE clause with bigtextfield LIKE '%val1%'
can never really be very efficient - even with just a single value.
The trouble is the fact that if you have a placeholder like "%" at the beginning of your search term, all the indices are out the window and cannot be used anymore.
So you're basically just searching each and every entry in your table doing a full table scan in the process. Now your performance basically just depends on the number of rows in your table....
I would support intgr's recommendation - if you need to do this frequently, have a serious look at fulltext indexing.
Upvotes: 2
Reputation: 425251
This will inevitable require a fullscan (over the table or over an index) with a filter.
The IN
condition won't help here, since it does not work on LIKE
You could do something like this:
SELECT *
FROM master
WHERE EXISTS
(
SELECT NULL
FROM values
WHERE name LIKE '%' + value + '%'
)
, but this hardly be more efficient.
All literal conditions will be transformed to a CONSTANT SCAN
which is just like selecting from the same table, but built in memory.
Upvotes: 1