SQL query to match keywords?

Question

I have a table with a column as nvarchar(max) with text extracted from word documents in it. How can I create a select query that I'll pass another a list of keywords as parameter and return the rows ordered by the number of matches?

Maybe it is possible with full text search?

Aaron Bertrand · Accepted Answer

Yes, possible with full text search, and likely the best answer. For a straight T-SQL solution, you could use a split function and join, e.g. assuming a table of numbers called dbo.Numbers (you may need to decide on a different upper limit):

SET NOCOUNT ON;
DECLARE @UpperLimit INT;
SET @UpperLimit = 200000;

WITH n AS
(
    SELECT
        rn = ROW_NUMBER() OVER
        (ORDER BY s1.[object_id])
    FROM sys.objects AS s1
    CROSS JOIN sys.objects AS s2
    CROSS JOIN sys.objects AS s3
)
SELECT [Number] = rn - 1
INTO dbo.Numbers
FROM n
WHERE rn <= @UpperLimit + 1;

CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers([Number]);

And a splitting function that uses that table of numbers:

CREATE FUNCTION dbo.SplitStrings
(
    @List NVARCHAR(MAX)
)
RETURNS TABLE
AS
    RETURN
    (
        SELECT DISTINCT
            [Value] = LTRIM(RTRIM(
                SUBSTRING(@List, [Number],
                CHARINDEX(N',', @List + N',', [Number]) - [Number])))
        FROM
            dbo.Numbers
        WHERE
            Number <= LEN(@List)
            AND SUBSTRING(N',' + @List, [Number], 1) = N','
    );
GO

Then you can simply say:

SELECT key, NvarcharColumn /*, other cols */
FROM dbo.table AS outerT
WHERE EXISTS
(
  SELECT 1 
    FROM dbo.table AS t 
    INNER JOIN dbo.SplitStrings(N'list,of,words') AS s
    ON t.NvarcharColumn LIKE '%' + s.Item + '%'
    WHERE t.key = outerT.key
);

As a procedure:

CREATE PROCEDURE dbo.Search
    @List NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT key, NvarcharColumn /*, other cols */
    FROM dbo.table AS outerT
    WHERE EXISTS
    (
      SELECT 1 
        FROM dbo.table AS t 
        INNER JOIN dbo.SplitStrings(@List) AS s
        ON t.NvarcharColumn LIKE '%' + s.Item + '%'
        WHERE t.key = outerT.key
    );
END
GO

Then you can just pass in @List (e.g. EXEC dbo.Search @List = N'foo,bar,splunge') from C#.

This won't be super fast, but I'm sure it will be quicker than pulling all the data out into C# and double-nested loop it manually.

SQL query to match keywords?

Answers (2)

Related Questions