How to get DISTINCT rows COUNTed with different conditions in a single SELECT statement?

Question

I have a table with orders from different users. Each user (identified by an e-mail address) can have any number of orders. I don't have an extra table with users, just a plain OrderData table. What I want sounds simple:

Total number of users
Number of XYZ users
Number of Non-XYZ users

where 1. is the sum of 2. and 3. "XYZ" users are defined by their e-mail address ending in "@xyz.com".

I want to have a single query, returning the three values in three columns. What I currently have is:

SELECT 
    (
    SELECT COUNT(DISTINCT User_EmailAddress)
    FROM OrderData
    WHERE User_EmailAddress IS NOT NULL
    AND RequestTime >= @RequestTimeFrom
    AND RequestTime  < @RequestTimeTo
    ) AS [Total Users],
    (
    SELECT COUNT(DISTINCT User_EmailAddress)
    FROM OrderData
    WHERE User_EmailAddress IS NOT NULL
    AND User_EmailAddress LIKE '%@xyz.com'
    AND RequestTime >= @RequestTimeFrom
    AND RequestTime  < @RequestTimeTo
    ) AS [XYZUsers],
    (
    SELECT COUNT(DISTINCT User_EmailAddress)
    FROM OrderData
    WHERE User_EmailAddress IS NOT NULL
    AND User_EmailAddress NOT LIKE '%@xyz.com'
    AND RequestTime >= @RequestTimeFrom
    AND RequestTime  < @RequestTimeTo
    ) AS [Non-XYZ Users]

It returns the correct result set:

Total Users | XYZ Users | Non-XYZ Users
------------+-----------+--------------
        123 |        23 |           100

Is there a nicer way to write this query, instead of basically having three similar queries with the same code?

Aaron Bertrand · Accepted Answer

Here is a solution that reduces redundant code and also avoid problems with multiple scans of the data. Since all you really care about is the counts, you don't even need to return any of the actual data, and you can collapse almost all of the logic into a single CASE expression:

;WITH x AS 
(
  SELECT isxyz = CASE WHEN User_EmailAddress LIKE '%@xyz.com' THEN 1 ELSE 0 END
  FROM dbo.OrderData
  WHERE User_EmailAddress IS NOT NULL
    AND RequestTime >= @RequestTimeFrom
    AND RequestTime  < @RequestTimeTo
    GROUP BY User_EmailAddress
)
SELECT 
  TotalUsers      = COUNT(isxyz),
  XYZUsers        = SUM(isxyz),
  [Non-XYZ Users] = COUNT(NULLIF(isxyz,1))
FROM x;

The trick is to use the aggregates to your advantage. Since the 1/0 output is not a true BIT column, you can SUM it to get the count of 1s. And the opposite can be achieved by changing the 1s to NULL, as NULLs are ignored by COUNT.

How to get DISTINCT rows COUNTed with different conditions in a single SELECT statement?

Answers (2)

Related Questions