Jeph
Jeph

Reputation: 1

SQL join: selecting last record that meets a condition from the original table

I am new to SQL, so excuse any lapse of notation. A much simplified version of my problem is as follows. I have hospital admissions in table ADMISSIONS and need to collect the most recent outpatient claim of a certain type from table CLAIMS prior to the admission date:

SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b on  (a.ID=b.ID) and (a.date>b.claim_date) 
LEFT JOIN claims c on ((a.ID=c.ID) and (a.date>c.claim_date))
     and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL

The problem is that for some IDs I get many records with duplicate a.date, c.claim_date values.

My problem is similar to one discussed here

SQL join: selecting the last records in a one-to-many relationship

and elaborated on here

SQL Left join: selecting the last records in a one-to-many relationship

However, there is the added wrinkle of looking only for records in CLAIMS that occur prior to a.date and I think that is causing the problem.

Update

Times are not stored, just dates, and since a patient can have multiple records on the same day, it's an issue. There is another wrinkle, which is that I only want to look at a subset of CLAIMS (let's say claims.flag=TRUE). Here's what I tried last:

SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN (
       select d.ID , max(d.claim_date) cdate
       from claims as d
       where d.flag=TRUE
       group by d.ID
       ) as b on  (a.ID=b.ID) and (b.claim_date < a.date) 
LEFT JOIN claims c on ((a.ID=c.ID) and (c.claim_date < a.claim_date))
     and c.flag=TRUE
     and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL

However, this ran for a couple of hours before aborting (typically takes about 30 mins with LIMIT 10).

Upvotes: 0

Views: 2312

Answers (1)

MeyerRJ
MeyerRJ

Reputation: 792

You may want to try using a subquery to solve this problem:

SELECT a.ID, a.date, b.claim_date
  FROM admissions as a
  LEFT JOIN claims b ON (a.ID = b.ID)
  WHERE b.claim_date = (
    SELECT MAX(c.claim_date) 
      FROM claims c 
      WHERE c.id = a.id -- Assuming that c.id is a foreign key to a.id
        AND c.claim_date < a.date  -- Claim date is less than admission date
  );

An attempt to clarify with different IDs, and using an additional subquery to account for duplicate dates:

SELECT a.ID, a.patient_id, a.date, b.claim_id, b.claim_date
  FROM admissions as a
  LEFT JOIN claims b ON (a.patient_ID = b.patient_ID)
  WHERE b.claim_id = (
    SELECT MAX(c.claim_id)  -- Max claim identifier (likely most recent if sequential)
      FROM claims c 
      WHERE c.patient_ID = a.patient_ID
                    AND c.flag = TRUE
                    AND c.claim_date = (
                        SELECT MAX(d.claim_date) 
                            FROM claims d
                            WHERE d.patient_id = c.patient_id 
                                AND c.claim_date < a.date  -- Claim date is less than admission date
                                AND d.flag = TRUE
                    )
  )
        b.flag = TRUE;

Upvotes: 1

Related Questions