Steven Ball
Steven Ball

Reputation: 481

Multiple joins to single table in SQL Server query throwing off counts

I am writing a stored procedure in SQL Server Management Studio 2005 to return a list of states and policy counts for two different time periods, month to date and year to date. I have created a couple of views to gather the required data and a stored procedure for use in a Reporting Services report.

Below is my stored procedure:

SELECT DISTINCT
    S.[State],
    COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
    COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD
FROM tblStates S
    LEFT OUTER JOIN vwHospitalPolicies HP_MTD ON S.[State] = HP.[State]
        AND HP.CreatedDate BETWEEN DATEADD(MONTH, -1, GETDATE()) AND GETDATE()
    LEFT OUTER JOIN vwHospitalPolicies HP_YTD ON S.[State] = HP.[State]
        AND HP.CreatedDate BETWEEN DATEADD(YEAR, -1, GETDATE()) AND GETDATE()
GROUP BY S.[State]
ORDER BY S.[State] ASC

The problem I am running into is my counts are bloating when a second LEFT OUTER JOIN is added, even the COUNT() that isn't referencing the second join. I need a left join since not all states will have policies for the given period, but they should still appear on the report.

Any suggestions would be greatly appreciated!

Upvotes: 1

Views: 1323

Answers (2)

Clockwork-Muse
Clockwork-Muse

Reputation: 13096

I prefer CTEs for this sort of work - they're basically a sort of inline view.

Your actual problem is that some policies are being counted twice - once for the month-to-date, and once for the year-to-date - in both count columns. So, if you have source tables like this:

year-to-date
state   policy
==================
1       1
1       2
1       3
2       4

month-to-date
state   policy
==================
1       1
1       2

The result table for the JOINs (before COUNT is assesed) looks like this:

temp
state  monthPolicy  yearPolicy
================================
1      1            1
1      1            2
1      1            3
1      2            1
1      2            2
1      2            3
2      -            4

Clearly not what you want. Often, the solution is to use a CTE or other table reference to present summed-up records for the final join. Something like this:

WITH PoliciesMonthToDate (state, count) as (
                          SELECT state, COUNT(*)
                          FROM vwHospitalPolicies
                          WHERE createdDate >= DATEADD(MONTH, -1, GETDATE())
                          AND createdDate < DATEADD(DAY, 1, GETDATE())
                          GROUP BY state),
     PoliciesYearToDate (state, count) as (
                         SELECT state, COUNT(*)
                         FROM vwHospitalPolicies
                         WHERE createdDate >= DATEADD(YEAR, -1, GETDATE())
                         AND createdDate < DATEADD(DAY, 1, GETDATE())
                         GROUP BY state)
SELECT a.state, COALESCE(b.count, 0) as policy_count_mtd, 
                COALESCE(c.count, 0) as policy_count_ytd
FROM tblStates a
LEFT JOIN PoliciesMonthToDate b
ON b.state = a.state
LEFT JOIN PoliciesYearToDate c
ON c.state = a.state
ORDER BY a.state

(minor nitpick - don't use prefixes for tables and views. It's noise, and if for some reason you switch between them, it would require a code change. That, or the name would then be misleading)

Upvotes: 0

weenoid
weenoid

Reputation: 1186

It sounds like you need:

COUNT(DISTINCT HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(DISTINCT HP_YTD.PolicyID) AS PolicyCount_YTD

instead of:

COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD

Your original query is including the number of matching rows in the second join. Adding a DISTINCT clause inside the COUNT limits it to unique occurrences of the PolicyID.

Upvotes: 2

Related Questions