mattsmith5
mattsmith5

Reputation: 1093

Does Adding Indexes speed up String Wildcard % searches?

We are conducting a wildcard search on a database table with column string. Does creating a non-clustered index on columns help with wildcard searches? Will this improve performance?

CREATE TABLE [dbo].[Product](
    [ProductId] [int] NOT NULL,
    [ProductName] [varchar](250) NOT NULL,
    [ModifiedDate] [datetime] NOT NULL, 
    ...
    CONSTRAINT [PK_ProductId] PRIMARY KEY CLUSTERED 
    (
       [ProductId] ASC
    )
 )

Proposed Index:

CREATE NONCLUSTERED INDEX [IX_Product_ProductName] ON [dbo].[Product] [ProductName])

for this query

select * from dbo.Product where ProductName like '%furniture%'

Currently using Microsoft SQL Server 2019.

Upvotes: 3

Views: 2458

Answers (5)

RBarryYoung
RBarryYoung

Reputation: 56735

Creating a normal index will not help(*), but a full-text index will, though you would have to change your query to something like this:

select * from dbo.Product where ProductName CONTAINS 'furniture'

(* -- well, it can be slightly helpful, in that it can reduce a scan over every row and column in your table into a scan over merely every row and only the relevant columns. However, it will not achieve the orders of magnitude performance boost that we normally expect from indexes that turn scans into single seeks.)

Upvotes: 3

SQLpro
SQLpro

Reputation: 5131

First you can use FTS to search words into sentences even partially (beginning by).

For those ending by or for those containing you can use a rotative indexing technic:

CREATE TABLE T_WRD
(WRD_ID                BIGINT IDENTITY PRIMARY KEY,
 WRD_WORD              VARCHAR(64) COLLATE Latin1_General_100_BIN NOT NULL UNIQUE,
 WRD_DROW              AS REVERSE(WRD_WORD) PERSISTED NOT NULL UNIQUE,
 WRD_WORD2             VARCHAR(64) COLLATE Latin1_General_100_CI_AI NOT NULL) ;
GO
    
CREATE TABLE T_WORD_ROTATE_STRING_WRS
(WRD_ID                BIGINT NOT NULL REFERENCES T_WRD (WRD_ID),
 WRS_ROTATE            SMALLINT NOT NULL,
 WRD_ID_PART           BIGINT NOT NULL REFERENCES T_WRD (WRD_ID),
 PRIMARY KEY (WRD_ID,  WRS_ROTATE));
GO
    
CREATE OR ALTER TRIGGER E_I_WRD
ON T_WRD
FOR INSERT
AS
    
SET NOCOUNT ON;
    
-- splitting words
WITH R AS
(
SELECT WRD_ID, TRIM(WRD_WORD) AS WRD_WORD, 0 AS ROTATE
FROM   INSERTED
UNION ALL
SELECT WRD_ID, RIGHT(WRD_WORD, LEN(WRD_WORD) -1), ROTATE + 1
FROM   R
WHERE  LEN(WRD_WORD) > 1
)
SELECT *
INTO #WRD
FROM   R;
    
-- inserting missing words
INSERT INTO T_WRD (WRD_WORD, WRD_WORD2)
SELECT WRD_WORD, LOWER(WRD_WORD) COLLATE SQL_Latin1_General_CP1251_CI_AS
FROM   #WRD
WHERE  WRD_WORD NOT IN (SELECT WRD_WORD 
                        FROM T_WRD);
    
-- inserting cross reference words
INSERT INTO T_WORD_ROTATE_STRING_WRS
SELECT M.WRD_ID, ROTATE, D.WRD_ID
FROM   #WRD AS M
      JOIN T_WRD AS D
         ON M.WRD_WORD = D.WRD_WORD
WHERE  NOT EXISTS(SELECT 1/0
                  FROM   T_WORD_ROTATE_STRING_WRS AS S
                  WHERE  S.WRD_ID = M.WRD_ID
                  AND  S.WRS_ROTATE = ROTATE);
GO

Then now you can insert into the first table all the words you want from your sentences and finding it by ending of partially in querying those two tables...

As an example, word:

WITH 
T AS (SELECT 'électricité' AS W)
INSERT INTO T_WRD 
SELECT W, LOWER(CAST(W AS VARCHAR(64)) COLLATE SQL_Latin1_General_CP1251_CI_AS) AS W2
FROM  T;

You can now use :

SELECT * FROM T_WRD;
SELECT * FROM T_WORD_ROTATE_STRING_WRS;

To find those partial words

Upvotes: 1

It depends on the optimizer. Like usually requires a full table scan. if the optimizer can scan an index for matches than it will do an index scan which is faster than a full table scan.

if the optimizer does not select an index scan you can force it to use an index. You must measure performance times to determine if using an index scan decreases search time

Use with (index(index_name)) to force an index scan e.g.

select * from t1 with (index(t1i1)) where v1 like '456%'

SQL Server Index - Any improvement for LIKE queries?

If you use %search% pattern, the optimizer will always perform a full table scan.

Another technique for speeding up searches is to use substrings and exact match searches.

Upvotes: -2

Dale K
Dale K

Reputation: 27226

For a double ended wildcard search as shown, an index cannot help you by restricting the rows SQL Server has to look at - a full table scan will be carried out. But it can help with the amount of data that has to be retrieved from disk.

Because in ProductName like '%furniture%', ProductName could start or end with any string, so no index can reduce the rows that have to be inspected.

However if a row in your Product table is 1,000 characters and you have 10,000 rows, you have to load that much data. But if you have an index on ProductName, and ProductName is only 50 characters, then you only have to load 10,000 * 50 rather than 10,000 * 1000.

Note: If the query was a single ended wildcard search with % at end of 'furniture%', then the proposed index would certainly help.

Upvotes: 3

Blindy
Blindy

Reputation: 67380

Yes, the part before the first % is matched against the index. Of course however, if your pattern starts with %, then a full scan will be performed instead.

Upvotes: -3

Related Questions