Stoleg
Stoleg

Reputation: 9290

Fastest string comparison in sql case

I have a straight forward SELECT query. Now i need to change values in one column based on another varchar column.

Here an example:

SELECT
  Company_Name,

  CASE 
    WHEN Size = 'Large' THEN Company_Name 
    ELSE 'Small company'
  end,

  CASE 
    WHEN left (Size,1) = 'L' THEN Company_Name 
    ELSE 'Small company'
  end,

  CASE 
    WHEN ASCII(Size) = 76 THEN Company_Name 
    ELSE 'Small company'
  end,

  Size
FROM MyTable

Here I have 3 CASEs. Which one is faster? This query is for data extract and it has no WHERE clause.

Thanks a lot!

UPDATE

Q: Why I cannot do it by running against a table?

A:

  1. Running a query on a given dataset does not prove solution is faster for any data set.
  2. I do not have permissions to reset DB, so test runs are comparable: DBCC DROPCLEANBUFFERS

Upvotes: 0

Views: 2842

Answers (1)

RichardCL
RichardCL

Reputation: 1482

There won't be any significant difference between the three options for normal volume of data. I tested this on a table with 10 million records.

CREATE TABLE dbo.Sizes
(
    ID INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    Size NVARCHAR(15) NOT NULL
)

DECLARE @i INT;
SET @i = 0;
WHILE @i < 10000000
BEGIN
    INSERT dbo.Sizes (Size)
        VALUES (N'Small'), (N'Medium'), (N'Large');
    SET @i = @i + 3;
END

First test.

SET STATISTICS TIME ON;
GO

SELECT ID, 
        CASE
            WHEN Size = N'Large' THEN CAST(ID AS NVARCHAR(25))
            ELSE N'Small company'
        END AS 'SizeDescription',
        Size
    FROM dbo.Sizes;
GO

SET STATISTICS TIME OFF;
GO

Results for first test.

SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms.

(10000002 row(s) affected)

SQL Server Execution Times: CPU time = 7422 ms, elapsed time = 64791 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Second test.

SET STATISTICS TIME ON;
GO

SELECT ID, 
        CASE
            WHEN LEFT(Size,1) = N'L' THEN CAST(ID AS NVARCHAR(25))
            ELSE N'Small company'
        END AS 'SizeDescription',
        Size
    FROM dbo.Sizes;
GO

SET STATISTICS TIME OFF;
GO

Results from second test.

SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms.

(10000002 row(s) affected)

SQL Server Execution Times: CPU time = 8203 ms, elapsed time = 69081 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Third test.

SET STATISTICS TIME ON;
GO

SELECT ID, 
        CASE
            WHEN ASCII(Size) = 76 THEN CAST(ID AS NVARCHAR(25))
            ELSE N'Small company'
        END AS 'SizeDescription',
        Size
    FROM dbo.Sizes;
GO

SET STATISTICS TIME OFF;
GO

Results from third test.

SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms.

(10000002 row(s) affected)

SQL Server Execution Times: CPU time = 8031 ms, elapsed time = 69460 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

The CPU time is the critical measurement and this is much the same for the three tests (between 7.4 and 8.2 seconds).

If you needed to speed up the query, you could reduce the size of the Size field by replacing it with a single character or tinyint. This would reduce the overall size of the table and speed up I/O.

You'd also need to consider the time taken to return the results which likely to be the limiting factor. And indeed locking on the table--are other users updating the table at the same time as running the queries?

Upvotes: 1

Related Questions