Reputation: 5
I am trying to understand why these two queries are returning different results. The first query is a sub-query in the second one. The first query is returning accurate results while the second one is not. I am primarily interested in the ACH_CODE
and ACH_PRIOR
identifiers.
The first query returns 0 and null
respectively while the second returns 0 and 1. This is a huge problem since I eventually want to select accounts that are greater than 0 for one or the other identifiers. I have double checked the system and this account should not be in my population but is ending up in the population along with other similar accounts.
Query 1:
SELECT
ACCOUNT_NUMBER, MATURITY_DATE, SNAPSHOT_DATE, ACH_CODE,
LAG(ACH_CODE, 1) OVER(ORDER BY SNAPSHOT_DATE) AS ACH_PRIOR
FROM
PORTFOLIO T1
WHERE
SNAPSHOT_DATE = (DATE*)
AND ACCOUNT_NUMBER = 81925169
Query 2:
SELECT
ACCOUNT_NUMBER, MATURITY_DATE, SNAPSHOT_DATE, ACH_CODE, ACH_PRIOR
FROM
(SELECT
ACCOUNT_NUMBER, MATURITY_DATE, SNAPSHOT_DATE, ACH_CODE,
LAG(ACH_CODE, 1) OVER(ORDER BY SNAPSHOT_DATE) AS ACH_PRIOR
FROM
PORTFOLIO T1
WHERE
SNAPSHOT_DATE = (DATE*))
WHERE
ACCOUNT_NUMBER = 81925169
The 'DATE'
in the query is actually another subquery that defines a particular date for each account number. The 'DATE'
subquery is the same for both queries. I just removed it for ease of reading.
Any insight would be greatly appreciated.
Upvotes: 0
Views: 603
Reputation: 191285
In your first query the lag()
only operates over rows matching the supplied account number. There is no previous row in the table for that account (matching the date from the subquery), so lag gets null.
In the second query it operates over all accounts. There is still no previous row for that account; but there is one for another account (which matches the date). The filter on account number is being applied after the lag has been evaluated. (Which value you get is indeterminate - it could be for any other account, or null, because you only order by date in the lag call.)
You can add a partition clause to restrict it to rows from the same account:
LAG(ACH_CODE, 1) OVER (PARTITION BY ACCOUNT_NUMBER ORDER BY SNAPSHOT_DATE)
Although, if snaphot date is unique for each account then it will always get null; and if it isn't the result is still indeterminate as you haven't specified how to break ties.
Upvotes: 2