Reputation: 1062
I'm trying to use the LAG function in Oracle to find the previous registration value for donors.
To see my data, I started with this query to find all registrations for the particular donor:
select registration_id, registration_date from registration r where r.person_id=52503290 order by r.registration_date desc;
Then I used the LAG
function to return the previous value along with the most recent value:
select registration_id as reg_id, registration_date as reg_date,
lag(registration_date,1) over (order by registration_date) as prev_reg_date
from registration
where person_id=52503290
order by registration_date desc;
And the results are as expected:
So I thought I should be good to place the LAG
function within the main query to get the previous value but for some reason, the previous value returns NULL or no value at all.
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
WHERE
TRUNC(R.Registration_Date) = TRUNC(SYSDATE)-1
AND R.Person_Id=52503290
ORDER BY
R.Registration_Date DESC;
Here is the result set:
Any suggestions on what I am missing here? Or why this query isn't returning the values expected?
Based on @Alex Poole's suggestions, I changed the query to look like:
SELECT * FROM (
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (partition by p.person_id ORDER BY r.registration_date) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
--WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;
It takes about 85 seconds to pull back the first 30 rows:
My original query (before the LAG
function was added) took about 2 seconds to pull back approximately 2100 records. But it was nothing but a SELECT with a couple of JOINS and one item in the WHERE clause.
Looking at the record counts, Person has almost 5.5 million records and Registration has 9.1 million records.
Upvotes: 1
Views: 6995
Reputation: 191245
The lag is only applied within the rows that match the where
clause filter, so you would only see the previous value if that was also yesterday.
You can apply the lag in a subquery, and then filter in an outer query:
SELECT * FROM (
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;
Upvotes: 2