David
David

Reputation: 459

Finding a min() date for one column and then using this to join with other tables that have a date LESS than this date

In short, I have two tables:

(1) pharmacy_claims (columns: user_id, date_service, claim_id, record_id, prescription)

(2) medical_claims (columns: user_id, date_service, provider, npi, cost)

I want to find user_id's in (1) that have a certain prescription value, find their earliest date_service (e.g. min(date_service)) and then use these user_id's with their earliest date of service as a cohort to pull all of their associated data from (2). Basically I want to find all of their medical_claims data PRIOR to the first time they were prescribed a given prescription in pharmacy_claims.

pharmacy_claims looks something like this:

user_id | prescription | date_service
1           a             2018-05-01
1           a             2018-02-11
1           a             2019-10-11
1           b             2018-07-12
2           a             2019-01-02
2           a             2019-03-10
2           c             2018-04-11
3           c             2019-05-26

So for instance, if I was interested in prescription = 'a', I would only want user_id 1 and 2 returned, with dates 2018-02-11 and 2019-01-02, respectively. Then I would want to pull user_id 1 and 2 from the medical_claims, and get all of their data PRIOR to these respective dates.

The way I tried to go about this was to build out a temp table in the pharmacy_claims table to query the user_id's that have a given medication, and then left join this back to the table to create a cohort of user_id's with a date_service

Here's what I did:

(1) Pulled all of the relevant data from the main pharmacy claims table:

CREATE TABLE user.temp_pharmacy_claims AS
SELECT user_id, claim_id, record_id, date_service
FROM dw.pharmacyclaims 
WHERE date_service between '2018-01-01' and '2019-08-31'

This results in ~50,000 user_id's

(2) Created a table with just the user_id's a min(date_service):

CREATE TABLE user.temp_pharmacy_claims_index AS
SELECT distinct user_id, min(date_service) AS Min_Date
FROM user.temp_pharmacy_claims
GROUP BY 1

(3) Created a final table (to get the desired cohort):

CREATE TABLE user.temp_pharmacy_claims_final_index AS
SELECT a.userid
FROM user.temp_pharmacy_claims  a
LEFT JOIN user.temp_pharmacy_claims_index b
ON a.user = b.user
WHERE  a.date_service < Min_Date

However, this gets me 0 results when there should be a few thousand. Is this set up correctly? It's probably not the most efficient approach, but it looks sound to me, so not sure what's going on.

Upvotes: 0

Views: 39

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269603

I think you just want a correlated subquery:

select mc.*
from medical_claims mc
where mc.date_service < (select min(pc.date)
                         from pharmacy_claims pc
                         where pc.user_id = mc.user_id and 
                               pc.prescription = ?
                        );

Upvotes: 1

Related Questions