Matthew Stott
Matthew Stott

Reputation: 397

SQL Query - slow using replace function

I have a query that is taking a long time, see below...

SELECT R.Email
      ,MAX(R.Id)
      ,MAX(R.Postcode)
FROM ParsedCandidates PC
    INNER JOIN Results R
        ON REPLACE(
             REPLACE(
               REPLACE(
                 REPLACE(R.[Resume], 'D:\documents\', '')
               ,'D:\CMT\Resumes\', '')
             , 'internal_', '')
           , 'monster_', '')
      = REPLACE(
          REPLACE(
            REPLACE(
              REPLACE(PC.[File], 'D:\documents\', '')
            ,'D:\CMT\Resumes\', '')
          ,'internal_', '')
        , 'monster_', '') 
WHERE CONTAINS(PC.ParsedCV, '"Marketing Executive"')
    AND R.Email IS NOT NULL
    AND R.Email <> ''
    AND R.Postcode IS NOT NULL
    AND R.Postcode <> ''
    AND EXISTS (SELECT 1
                FROM Candidates_Sourcing CS
                WHERE CS.Email = R.Email
                    AND CS.Email IS NOT NULL
                    AND CS.Email <> ''
               )
GROUP BY R.Email;

Both the candidates_sourcing table and the results table have many, many rows.

I know the replace function will be causing issues with sargability however I need to do it to ensure the match.

Any ideas how this can be improved

Upvotes: 2

Views: 1350

Answers (1)

MarcinJ
MarcinJ

Reputation: 3639

What you can do is create persisted columns on both tables and index those

ALTER TABLE Results ADD FixedPath AS REPLACE(
             REPLACE(
               REPLACE(
                 REPLACE([Resume], 'D:\documents\', '')
               ,'D:\CMT\Resumes\', '')
             , 'internal_', '')
           , 'monster_', '') PERSISTED

CREATE NONCLUSTERED INDEX ixResults_FixedPath ON Results (FixedPath) INCLUDE (...) WHERE (...)

INCLUDE and possibly WHERE of your index will depend on your queries.

If you don't want to alter the table, you can create an indexed view on both these tables and then join the views.

CREATE VIEW v_Results 
WITH SCHEMABINDING
AS
SELECT R.Id
--   , ... other columns ...
     , REPLACE(
                 REPLACE(
                   REPLACE(
                     REPLACE(R.[Resume], 'D:\documents\', '')
                   ,'D:\CMT\Resumes\', '')
                 , 'internal_', '')
               , 'monster_', '') AS FixedPath
  FROM dbo.Resume R
 WHERE R.Email IS NOT NULL
   AND R.Email <> ''
   AND R.Postcode IS NOT NULL
   AND R.Postcode <> ''
GO

However, the index has to be unique here.

CREATE UNIQUE CLUSTERED INDEX ux ON dbo.v_Results (FixedPath, Id);

Having created both these views you can then join

SELECT ...
  FROM v_Results R WITH (NOEXPAND)
  JOIN v_ParsedCandidates PC WITH (NOEXPAND)
    ON R.FixedPath = PC.FixedPath

NOEXPAND hint prevents SQL Server from expanding the view into the underlying query. See here.

Upvotes: 3

Related Questions