Ransom
Ransom

Reputation: 137

More efficient way of doing multiple joins to the same table and a "case when" in the select

At my organization clients can be enrolled in multiple programs at one time. I have a table with a list of all of the programs a client has been enrolled as unique rows in and the dates they were enrolled in that program.

Using an External join I can take any client name and a date from a table (say a table of tests that the clients have completed) and have it return all of the programs that client was in on that particular date. If a client was in multiple programs on that date it duplicates the data from that table for each program they were in on that date.

The problem I have is that I am looking for it to only return one program as their "Primary Program" for each client and date even if they were in multiple programs on that date. I have created a hierarchy for which program should be selected as their primary program and returned.

For Example:

1.)Inpatient

2.)Outpatient Clinical

3.)Outpatient Vocational

4.)Outpatient Recreational

So if a client was enrolled in Outpatient Clinical, Outpatient Vocational, Outpatient Recreational at the same time on that date it would only return "Outpatient Clinical" as the program.

My way of thinking for doing this would be to join to the table with the previous programs multiple times like this:

FROM dbo.TestTable as TestTable

LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms1
ON TestTable.date = PreviousPrograms1.date AND PreviousPrograms1.type = 'Inpatient'

LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms2
ON TestTable.date = PreviousPrograms2.date AND PreviousPrograms2.type = 'Outpatient Clinical'

LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms3
ON TestTable.date = PreviousPrograms3.date AND PreviousPrograms3.type = 'Outpatient Vocational'

LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms4
ON TestTable.date = PreviousPrograms4.date AND PreviousPrograms4.type = 'Outpatient Recreational'

and then do a condition CASE WHEN in the SELECT statement as such:

SELECT

CASE 
        WHEN PreviousPrograms1.name IS NOT NULL
            THEN PreviousPrograms1.name
        WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NOT NULL
            THEN PreviousPrograms2.name
        WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NOT NULL
            THEN PreviousPrograms3.name
        WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NOT NULL AND PreviousPrograms4.name IS NOT NULL
            THEN PreviousPrograms4.name
        ELSE NULL
        END as PrimaryProgram

The bigger problem is that in my actual table there are a lot more than just four possible programs it could be and the CASE WHEN select statement and the JOINs are already cumbersome enough.

Is there a more efficient way to do either the SELECTs part or the JOIN part? Or possibly a better way to do it all together?

I'm using SQL Server 2008.

Upvotes: 1

Views: 1345

Answers (3)

Timothy Walters
Timothy Walters

Reputation: 16874

You can achieve this using sub-queries, or you could refactor it to use CTEs, take a look at the following and see if it makes sense:

DECLARE @testTable TABLE
(
    [id] INT IDENTITY(1, 1),
    [date] datetime
)
DECLARE @previousPrograms TABLE
(
    [id] INT IDENTITY(1,1),
    [date] datetime,
    [type] varchar(50)
)

INSERT INTO @testTable ([date])
SELECT '2013-08-08'
UNION ALL SELECT '2013-08-07'
UNION ALL SELECT '2013-08-06'

INSERT INTO @previousPrograms ([date], [type])
-- a sample user as an inpatient
SELECT '2013-08-08', 'Inpatient'
-- your use case of someone being enrolled in all 3 outpation programs
UNION ALL SELECT '2013-08-07', 'Outpatient Recreational'
UNION ALL SELECT '2013-08-07', 'Outpatient Clinical'
UNION ALL SELECT '2013-08-07', 'Outpatient Vocational'

-- showing our workings, this is what we'll join to
SELECT 
    PPP.[date],
    PPP.[type],
    ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
FROM (
    SELECT
        [type],
        [date],
        CASE 
            WHEN [type] = 'Inpatient' THEN 1
            WHEN [type] = 'Outpatient Clinical' THEN 2
            WHEN [type] = 'Outpatient Vocational' THEN 3
            WHEN [type] = 'Outpatient Recreational' THEN 4
            ELSE 999
        END AS [Priority]
    FROM @previousPrograms
) PPP -- Previous Programs w/ Priority

SELECT
    T.[date],
    PPPO.[type]
FROM @testTable T
LEFT JOIN (
    SELECT 
        PPP.[date],
        PPP.[type],
        ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
    FROM (
        SELECT
            [type],
            [date],
            CASE 
                WHEN [type] = 'Inpatient' THEN 1
                WHEN [type] = 'Outpatient Clinical' THEN 2
                WHEN [type] = 'Outpatient Vocational' THEN 3
                WHEN [type] = 'Outpatient Recreational' THEN 4
                ELSE 999
            END AS [Priority]
        FROM @previousPrograms
    ) PPP -- Previous Programs w/ Priority
) PPPO -- Previous Programs w/ Priority + Order
    ON T.[date] = PPPO.[date] AND PPPO.[RowNumber] = 1

Basically we have our deepest sub-select giving all PreviousPrograms a priority based on type, then our wrapping sub-select gives them row numbers per date so we can select only the ones with a row number of 1.

I am guessing you would need to include a UR Number or some other patient identifier, simply add that as an output to both sub-selects and change the join.

Upvotes: 2

Bohemian
Bohemian

Reputation: 425043

You can simplify (replace) your CASE by using COALESCE() instead:

SELECT
  COALESCE(PreviousPrograms1.name, PreviousPrograms2.name,
    PreviousPrograms3.name, PreviousPrograms4.name) AS PreviousProgram

COALESCE() returns the first non-null value.

Due to your design, you still need the JOINs, but it would be much easier to read if you used very short aliases, for example PP1 instead of PreviousPrograms1 - it's just a lot less code noise.

Upvotes: 3

Luis Carvalho
Luis Carvalho

Reputation: 101

You can simplify the Join by using a bridge table containing all the program types and their priority (my sql server syntax is a bit rusty):

create table BridgeTable (
    programType varchar(30),
    programPriority smallint
);

This table will hold all the program types and the program priority will reflect the priority you've specified in your question.

As for the part of the case, that will depend on the number of records involved. One of the tricks that I usually do is this (assuming programPriority is a number between 10 and 99 and no type can have more than 30 bytes, because I'm being lazy):

Select patient, date, 
                substr( min(cast(BridgeTable.programPriority as varchar) || PreviousPrograms.type), 3, 30) 
From dbo.TestTable as TestTable
Inner Join dbo.BridgeTable as BridgeTable
Left Outer Join dbo.PreviousPrograms as PreviousPrograms 
                on PreviousPrograms.type = BridgeTable.programType 
                   and TestTable.date = PreviousPrograms.date
Group by patient, date

Upvotes: 2

Related Questions