Sachin Singh
Sachin Singh

Reputation: 948

Is there and alternative to LIKE statement in T-SQL?

I have a scenario where I need to perform following operation:

SELECT *
FROM
[dbo].[MyTable]
WHERE
[Url] LIKE '%<some url>%';

I have to use two % (wildcard characters) at the beginning and the end of Url ('%<some url>%') as user should be able to search the complete url even if he types partial text. For example, if url is http://www.google.co.in and user types "goo", then the url must appear in search results. LIKE operator is causing performance issues. I need an alternative so that I can get rid of this statement and wildcards. In other words, I don't want to use LIKE statement in this scenario. I tried using T-SQL CONTAINS but it is not solving my problem. Is there any other alternative available than can perform pattern matching and provide me results quickly?

Upvotes: 4

Views: 50791

Answers (5)

paparazzo
paparazzo

Reputation: 45106

Starting a like with a % is going to cause a scan. No getting around it. It has to evaluate every value.

If you index the column it should be an index (rather than table) scan.

You don't have an alternative that will not cause a scan.
Charindex and patindex are alternatives but will still scan and not fix the performance issue.

Could you break the components out into a separate table?
www
google
co
in

And then search on like 'goo%'?
That would use an index as it does not start with %.

Better yet you could search on 'google' and get an index seek.

And you would want to have the string unique in that table with a separate join on Int PK so it does not return multiple www for instance.

Suspect FullText Contains was not faster because FullText kept the URL as one word.

Upvotes: 9

Thorsten Kettner
Thorsten Kettner

Reputation: 95101

Your query is a very simple one and I see no reason for it to be slow. The dbms wil read record for record and compare strings. Usually it can even do this in parallel threads.

What do you think can be the reason for your statement being so slow? Are there billions of records in your table? Do your records contain so much data?

Your best bet is not to care about the query, but about the database and your system. Others have already suggested an index on the url column, so rather than scanning the table, the index can be scanned. Is max degree of parallelism mistakenly set? Is your table fragmented? Is your hardware appropriate? These are the things to consider here.

However: charindex('oogl', url) > 0 does the same as url like '%oogl%', but internally they work differently somehow. For some people the LIKE expression turned out faster, for others the CHARINDEX method. Maybe it depends on the query, number of processors, operating system, whatever. It may be worth a try.

Upvotes: 2

JohnLBevan
JohnLBevan

Reputation: 24490

To my knowledge there's no alternative to like or contains (full text search feature) which would give better performance. What you can do is try to improve performance by optimising your query. To do that, you need to know a bit about your users & how they'll use your system. I suspect most people will enter a URL from the start of the address (i.e. without protocol), so you could do something like this:

declare @searchTerm nvarchar(128) = 'goo'
set @searchTerm = coalesce(replace(@searchTerm ,'''',''''''),'')
select @searchTerm

SELECT *
FROM [dbo].[MyTable]
WHERE [Url] LIKE 'http://' + @searchTerm + '%'
or [Url] LIKE 'https://' + @searchTerm + '%'
or [Url] LIKE 'http://www.' + @searchTerm + '%'
or [Url] LIKE 'https://www.' + @searchTerm + '%'
or [Url] LIKE '%' + @searchTerm + '%'
option (fast 1); --get back the first result asap; 

That then gives you some optimisation; i.e. if the url's http://www.google.com the index on the url column can be used since http://www.goo is at the start of the string. The option (fast 1) piece on the end's to ensure this benefit is seen; since the last URL like %searchTerm% can't make use of indexes, we'd rather return responses as soon as we can rather than wait for that slow part to complete. Have a think of other common usage patterns and ways around those.

Upvotes: 1

DavidG
DavidG

Reputation: 119186

You could create a FULLTEXT index.

First create your catalog:

CREATE FULLTEXT CATALOG ft AS DEFAULT;

Now assuming your table is called MyTable, the column is TextColumn and it has a unique index on it called UX_MyTable_TextColumn:

CREATE FULLTEXT INDEX ON [dbo].[MyTable](TextColumn) 
    KEY INDEX UX_MyTable_TextColumn

Now you can search the table using CONTAINS:

SELECT *
FROM MyTable
WHERE CONTAINS(TextColumn, 'searchterm')

Upvotes: 1

Jeroen Mostert
Jeroen Mostert

Reputation: 28809

As written, your query cannot be further optimized, and there is no way of getting around the LIKE to do your searching. The only thing you can do to improve performance is reduce the SELECT to return only the columns you need if you don't need all of them, and create an index on URL with those columns included. The LIKE will not be able to use the index for seeking, but the reduced data size for scanning can help. If you have a SQL Server edition that supports compression, that will help as well.

For instance, if you really need only column A, write

SELECT A FROM [dbo].[MyTable] WHERE [Url] LIKE '%<some url>%';

And create the index as

CREATE INDEX IX_MyTable_URL 
ON MyTable([Url]) 
INCLUDE (A) WITH (DATA_COMPRESSION = PAGE);

If A is already included in your primary key, the INCLUDE is unnecessary.

Upvotes: 0

Related Questions