Kaine
Kaine

Reputation: 579

SQL Server ranking weirdness using FREETEXTTABLE across multiple columns

I have been struggling to get my head around how SQL Server full text search ranks my results.

Consider the following FREETEXTTABLE search:

DECLARE @SearchTerm varchar(55) = 'Peter Alex'

SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;

This returns the following results and rankings:

enter image description here

RANK  ID   MemberRef  Passport    FirstName   MiddleName   Surname    Salutation  
----- ---- ---------- ----------- ----------- ------------ ---------- ------------
18    2    AB-002                 Pete                     Peters     
18    9    AB-006                 George                   Alex       Mr Alex
18    13   AB-009                 Peter       David        Alex       Mr Alex
14    3    AB-003                 Peter       Alex         Jones                  

As you may be able to tell from the results posted above, the last row, although having, what I consider, a good match on both 'Peter' and 'Alex', appears with a rank of only 14 where the result in the first row has only a single match on 'Peter' (admittedly the surname is 'Peters').

This is a contrived example, but goes some way to illustrate my frustrations and lack of knowledge.

I have spent quite a bit of time researching, but I am feeling a bit out of my depth now. I'm sure that I'm doing something stupid such as searching across multiple columns.

I welcome your help and support. Thanks in advance.

Thanks,

Kaine

(BTW I am using SQL Server 2012)

Here is the SQL you can use to repeat the test yourself:

-- Create the Contacts table.
CREATE TABLE dbo.Contacts
(
    ID          int         NOT NULL PRIMARY KEY,
    FirstName   varchar(55) NULL,
    MiddleName  varchar(55) NULL,
    Surname     varchar(55) NOT NULL,
    Salutation  varchar(55) NULL,
    Passport    varchar(55) NULL
);
GO

-- Create the Members table.
CREATE TABLE dbo.Members
(
    ContactsID  int         NOT NULL PRIMARY KEY,
    MemberRef   varchar(55) NOT NULL
);
GO

-- Create the FTS view.
CREATE VIEW dbo.vMembersFTS WITH SCHEMABINDING AS
SELECT  c.ID, 
        m.MemberRef,
        ISNULL(c.Passport, '') AS Passport,
        ISNULL(c.FirstName, '') AS FirstName,
        ISNULL(c.MiddleName, '') AS MiddleName, 
        c.Surname, 
        ISNULL(c.Salutation, '') AS Salutation
FROM dbo.Contacts c 
INNER JOIN dbo.Members AS m ON m.ContactsID = c.ID 
GO

-- Create the view index for FTS.
CREATE UNIQUE CLUSTERED INDEX IX_vMembersFTS_ID ON dbo.vMembersFTS (ID);
GO

-- Create the FTS catalogue and stop-list.
CREATE FULLTEXT CATALOG ContactsFTSCatalog WITH ACCENT_SENSITIVITY = OFF;
CREATE FULLTEXT STOPLIST ContactsSL FROM SYSTEM STOPLIST;
GO

-- Create the member full-text index.
CREATE FULLTEXT INDEX ON dbo.vMembersFTS
    (Surname, Firstname, MiddleName, Salutation, MemberRef, Passport)
KEY INDEX IX_vMembersFTS_ID
ON ContactsFTSCatalog
WITH STOPLIST = ContactsSL;
GO



-- Insert some data.
INSERT INTO Contacts VALUES (1, 'John', NULL, 'Smith', NULL, NULL);
INSERT INTO Contacts VALUES (2, 'Pete', NULL, 'Peters', NULL, NULL);
INSERT INTO Contacts VALUES (3, 'Peter', 'Alex', 'Jones', NULL, NULL);
INSERT INTO Contacts VALUES (4, 'Philip', NULL, 'Smith', NULL, NULL);
INSERT INTO Contacts VALUES (5, 'Harry', NULL, 'Dukes', NULL, NULL);
INSERT INTO Contacts VALUES (6, 'Joe', NULL, 'Jones', NULL, NULL);
INSERT INTO Contacts VALUES (7, 'Alex', NULL, 'Phillips', 'Mr Phillips', NULL);
INSERT INTO Contacts VALUES (8, 'Alexander', NULL, 'Paul', 'Alex', NULL);
INSERT INTO Contacts VALUES (9, 'George', NULL, 'Alex', 'Mr Alex', NULL);
INSERT INTO Contacts VALUES (10, 'James', NULL, 'Castle', NULL, NULL);
INSERT INTO Contacts VALUES (11, 'John', NULL, 'Alexander', NULL, NULL);
INSERT INTO Contacts VALUES (12, 'Robert', NULL, 'James', 'Mr James', NULL);
INSERT INTO Contacts VALUES (13, 'Peter', 'David', 'Alex', 'Mr Alex', NULL);
INSERT INTO Members VALUES (1, 'AB-001');
INSERT INTO Members VALUES (2, 'AB-002');
INSERT INTO Members VALUES (3, 'AB-003');
INSERT INTO Members VALUES (5, 'AB-004');
INSERT INTO Members VALUES (8, 'AB-005');
INSERT INTO Members VALUES (9, 'AB-006');
INSERT INTO Members VALUES (11, 'AB-007');
INSERT INTO Members VALUES (12, 'AB-008');
INSERT INTO Members VALUES (13, 'AB-009');



-- Run the FTS query.
DECLARE @SearchTerm varchar(55) = 'Peter Alex'
SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;

Upvotes: 2

Views: 1402

Answers (1)

Dave C
Dave C

Reputation: 7402

The rank is assigning based on the order in your query:

DECLARE @SearchTerm varchar(55) = 'Peter Alex'
SELECT ftt.[RANK], v.*
FROM FREETEXTTABLE (vMembersFTS, (Surname, FirstName, MiddleName, MemberRef, Passport), @SearchTerm) ftt
INNER JOIN vMembersFTS v ON v.ID = ftt.[KEY]
ORDER BY ftt.[RANK] DESC;

So in your case, a match on SurName trumps FirstName, and both trump MiddleName.

Your top 3 results have a rank of 18 as all three match on Surname. The last record has a rank of 14 for matching on FirstName and MiddleName but not SurName.

You can find details on the rank calculations here: https://technet.microsoft.com/en-us/library/ms142524(v=sql.105).aspx

If you want to allocate equal weight to these you can, but you'd have to use CONTAINSTABLE and not FREETEXTTABLE.

Info can be found here: https://technet.microsoft.com/en-us/library/ms189760(v=sql.105).aspx

Upvotes: 0

Related Questions