Reputation: 1
I am new to SQL, so excuse any lapse of notation. A much simplified version of my problem is as follows. I have hospital admissions in table ADMISSIONS and need to collect the most recent outpatient claim of a certain type from table CLAIMS prior to the admission date:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b on (a.ID=b.ID) and (a.date>b.claim_date)
LEFT JOIN claims c on ((a.ID=c.ID) and (a.date>c.claim_date))
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
The problem is that for some IDs I get many records with duplicate a.date, c.claim_date values.
My problem is similar to one discussed here
SQL join: selecting the last records in a one-to-many relationship
and elaborated on here
SQL Left join: selecting the last records in a one-to-many relationship
However, there is the added wrinkle of looking only for records in CLAIMS that occur prior to a.date and I think that is causing the problem.
Update
Times are not stored, just dates, and since a patient can have multiple records on the same day, it's an issue. There is another wrinkle, which is that I only want to look at a subset of CLAIMS (let's say claims.flag=TRUE). Here's what I tried last:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN (
select d.ID , max(d.claim_date) cdate
from claims as d
where d.flag=TRUE
group by d.ID
) as b on (a.ID=b.ID) and (b.claim_date < a.date)
LEFT JOIN claims c on ((a.ID=c.ID) and (c.claim_date < a.claim_date))
and c.flag=TRUE
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
However, this ran for a couple of hours before aborting (typically takes about 30 mins with LIMIT 10).
Upvotes: 0
Views: 2312
Reputation: 792
You may want to try using a subquery to solve this problem:
SELECT a.ID, a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b ON (a.ID = b.ID)
WHERE b.claim_date = (
SELECT MAX(c.claim_date)
FROM claims c
WHERE c.id = a.id -- Assuming that c.id is a foreign key to a.id
AND c.claim_date < a.date -- Claim date is less than admission date
);
An attempt to clarify with different IDs, and using an additional subquery to account for duplicate dates:
SELECT a.ID, a.patient_id, a.date, b.claim_id, b.claim_date
FROM admissions as a
LEFT JOIN claims b ON (a.patient_ID = b.patient_ID)
WHERE b.claim_id = (
SELECT MAX(c.claim_id) -- Max claim identifier (likely most recent if sequential)
FROM claims c
WHERE c.patient_ID = a.patient_ID
AND c.flag = TRUE
AND c.claim_date = (
SELECT MAX(d.claim_date)
FROM claims d
WHERE d.patient_id = c.patient_id
AND c.claim_date < a.date -- Claim date is less than admission date
AND d.flag = TRUE
)
)
b.flag = TRUE;
Upvotes: 1