Scott
Scott

Reputation: 33

SQL merging result sets on a unique column value

I have 2 similar queries which both work on the same table, and I essentially want to combine their results such that the second query supplies default values for what the first query doesn't return. I've simplified the problem as much as possible here. I'm using Oracle btw.

The table has account information in it for a number of accounts, and there are multiple entries for each account with a commit_date to tell when the account information was inserted. I need get the account info which was current for a certain date.

The queries take a list of account ids and a date.

Here is the query:

-- Select the row which was current for the accounts for the given date. (won't return anything for an account which didn't exist for the given date)
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
AND actr.commit_date = 
(
    SELECT MAX(actrInner.commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner.account_id = actr.account_id
    AND actrInner.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
) 

This looks a little ugly, but it returns a single row for each account which was current for the given date. The problem is that it doesn't return anything if the account didn't exist until after the given date.

Selecting the earliest account info for each account is trival - I don't need to supply a date for this one:

-- Select the earliest row for the accounts.
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date = 
(
    SELECT MAX(actrInner .commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner .account_id = actr.account_id
)  

But I want to merge the result sets in such a way that:

For each account, if there is account info for it in the first result set - use that. Otherwise, use the account info from the second result set.

I've researched all of the joins I can use without success. Unions almost do it but they will only merge for unique rows. I want to merge based on the account id in each row.

Sql Merging two result sets - my case is obviously more complicated than that

SQL to return a merged set of results - I might be able to adapt that technique? I'm a programmer being forced to write SQL and I can't quite follow that example well enough to see how I could modify it for what I need.

Upvotes: 3

Views: 2417

Answers (3)

Dave Johnson
Dave Johnson

Reputation: 411

I've combined the other answers. Tried it out at apex.oracle.com. Here's some explanation.

MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD')) will give us the latest date not before Dec 30th, or NULL if there isn't one. Combining that with a COALESCE, we get COALESCE(MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') THEN commit_date END), MAX(commit_date)).

Now we take the account id and commit date we have and join them with the original table to get all the other fields. Here's the whole query that I came up with:

SELECT *
FROM Account_Information
JOIN (SELECT account_id,
             COALESCE(MAX(CASE WHEN commit_date <=
                                    to_date('2010-DEC-30', 'YYYY-MON-DD')
                               THEN commit_date END),
                      MAX(commit_date)) AS commit_date
      FROM Account_Information
      WHERE account_id in (30000316, 30000350, 30000351)
      GROUP BY account_id)
  USING (account_id, commit_date);

Note that if you do use USING, you have to use * instead of acrt.*.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270463

You can do this more directly using analytic functions:

select *
from (SELECT actr.*, max(commit_date) over (partition by account_id) as maxCommitDate,
             max(case when commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ') then commit_date end) over
                   (partition by account_id) as MaxCommitDate2
      FROM Account_Information actr 
      WHERE actr.account_id in (30000316, 30000350, 30000351)
     ) t
where (MaxCommitDate2 is not null and Commit_date = MaxCommitDate2) or
      (MaxCommitDate2 is null and Commit_Date = MaxCommitDate) 

The subquery calculates two values, the two possibilities of commit dates. The where clause then chooses the appropriate row, using the logic that you want.

Upvotes: 1

anon
anon

Reputation: 4618

The standard way to do this is with a left outer join and coalesce. That is, your overall query will look like this:

SELECT ...
FROM defaultQuery
LEFT OUTER JOIN currentQuery ON ...

If you did a SELECT *, each row would correspond to the current account data plus your defaults. With me so far?

Now, instead of SELECT *, for each column you want to return, you do a COALESCE() on matched pairs of columns:

SELECT COALESCE(currentQuery.columnA, defaultQuery.columnA) ...

This will choose the current account data if present, otherwise it will choose the default data.

Upvotes: 5

Related Questions