MLorenzen
MLorenzen

Reputation: 69

SQL - How to return only rows which have a value which appears more than once?

I'm fairly new to SQL server...but I have a problem. I've written the below query, which works just fine for displaying all visits (encounters) and lab results for a patient in 2012.

SELECT pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
pat.race, pl.icdcode, enc.encounter_date, lab.test_performed, lab.result_value,
lab.result_units
FROM dbo.dem_patient pat
RIGHT JOIN dbo.med_problemlist as pl
ON pat.patient_id = pl.patient_id
JOIN dbo.enc_encounter as enc
ON pat.patient_id = enc.patient_id
JOIN dbo.med_labresult as lab
ON pat.patient_id = lab.patient_id
WHERE pl.icdcode like '401.9%' AND
enc.encounter_date >= '01/01/2012' AND enc.encounter_date <= '12/31/2012' AND
pat.last_name != 'test' AND (lab.test_performed = 'HgbA1c' OR 
lab.test_performed = 'Hemoglobin A1c')
GROUP BY pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
pat.race, pl.icdcode, enc.encounter_date, lab.test_performed, lab.result_value,
lab.result_units
ORDER BY race, ethnic_group

The result from the above query looks something like this:

patient_id  last_name  first_name  ethnic_group  race  icdcode  encounter_date lab_result
0           smith      john        HIS           WHI   401.90   01/01/2012     9.1
0           smith      john        HIS           WHI   401.90   10/12/2012     9.2
0           smith      john        HIS           WHI   401.90   11/15/2012     9.7
5           doe        jane        NSP           BLK   401.90   01/20/2012     11
6           davids     david       NSP           BLK   401.90   04/25/2012     5.9
6           davids     david       NSP           BLK   401.90   05/04/2012     6.4

However, I now need to return only those patients who have more than one visit (encounter) between 1/1/2012 and 12/31/2012. It would look like this (from the above result)

patient_id  last_name  first_name  ethnic_group  race  icdcode  encounter_date lab_result
0           smith      john        HIS           WHI   401.90   01/01/2012     9.1
6           davids     david       NSP           BLK   401.90   04/25/2012     5.9

Can anyone help me with this?

UPDATE: So, using the suggestions below, I've added HAVING COUNT(*) > 1 to the query. However, that gives me some results I don't quite understand.

Results of query without the HAVING COUNT(*) > 1 statement:

1001131     BLA 401.9   48:48.3 Hemoglobin A1c  9.9
1001131     BLA 401.9   00:00.0 Hemoglobin A1c  9.9
1001131     BLA 401.9   44:44.3 Hemoglobin A1c  9.9
1001648     BLA 401.9   50:15.6 Hemoglobin A1c  6.5
1001648     BLA 401.9   16:15.1 Hemoglobin A1c  6.5
1001648     BLA 401.9   28:04.5 Hemoglobin A1c  6.5
1001648     BLA 401.9   17:00.2 Hemoglobin A1c  6.5
1002648     BLA 401.9   17:47.8 Hemoglobin A1c  6.1
1002648     BLA 401.9   52:38.0 Hemoglobin A1c  6.1
1002648     BLA 401.9   43:47.3 Hemoglobin A1c  6.1
1002648     BLA 401.9   02:55.9 Hemoglobin A1c  6.1
1002715     BLA 401.9   37:52.4 Hemoglobin A1c  13.1
1002715     BLA 401.9   44:41.0 Hemoglobin A1c  12.6
1002715     BLA 401.9   44:41.0 Hemoglobin A1c  13.1
1002715     BLA 401.9   37:52.4 Hemoglobin A1c  10.7
1002715     BLA 401.9   37:52.4 Hemoglobin A1c  12.6
1002715     BLA 401.9   00:57.9 Hemoglobin A1c  10.7
1002715     BLA 401.9   00:57.9 Hemoglobin A1c  12.6
1002715     BLA 401.9   00:57.9 Hemoglobin A1c  13.1
1002715     BLA 401.9   38:17.6 Hemoglobin A1c  10.7
1002715     BLA 401.9   38:17.6 Hemoglobin A1c  12.6
1002715     BLA 401.9   38:17.6 Hemoglobin A1c  13.1
1002715     BLA 401.9   44:41.0 Hemoglobin A1c  10.7
1002893     BLA 401.9   33:45.4 Hemoglobin A1c  5.9
1002893     BLA 401.9   22:58.5 Hemoglobin A1c  5.9
1002893     BLA 401.9   00:00.0 Hemoglobin A1c  5.9
1002893     BLA 401.9   55:11.3 Hemoglobin A1c  5.9
1003195     BLA 401.9   34:08.7 Hemoglobin A1c  6.4
1003195     BLA 401.9   50:03.6 Hemoglobin A1c  6.4
1003195     BLA 401.9   08:36.8 Hemoglobin A1c  6.4
1003195     BLA 401.9   33:56.6 Hemoglobin A1c  6.4
1003195     BLA 401.9   34:04.3 Hemoglobin A1c  6.4

Results of query with the HAVING statement:

1001937     BLA 401.9   11/14/12Hemoglobin A1c  6.3
1001937     BLA 401.9   8/14/12 Hemoglobin A1c  6.3
1001937     BLA 401.9   5/14/12 Hemoglobin A1c  6.3
1001937     BLA 401.9   2/14/12 Hemoglobin A1c  6.3
1001937     BLA 401.9   11/14/12Hemoglobin A1c  6.3
1002034     BLA 401.9   1/10/12 HEMOGLOBIN A1c  7.7
1002034     BLA 401.9   6/8/12  HEMOGLOBIN A1c  7.7
1002034     BLA 401.9   9/7/12  HEMOGLOBIN A1c  7.7
1002034     BLA 401.9   10/15/12HEMOGLOBIN A1c  7.7
1002648     BLA 401.9   2/22/12 Hemoglobin A1c  6.1
1002648     BLA 401.9   7/2/12  Hemoglobin A1c  6.1
1002648     BLA 401.9   10/2/12 Hemoglobin A1c  6.1
1002648     BLA 401.9   2/22/12 Hemoglobin A1c  6.1
1020923     BLA 401.9   1/30/12 HEMOGLOBIN A1c  10
1020923     BLA 401.9   1/30/12 Hemoglobin A1c  11.8
1020923     BLA 401.9   4/17/12 Hemoglobin A1c  11.8
1020923     BLA 401.9   5/16/12 HEMOGLOBIN A1c  10
1020923     BLA 401.9   5/16/12 Hemoglobin A1c  11.8
1020923     BLA 401.9   9/13/12 HEMOGLOBIN A1c  10
1020923     BLA 401.9   9/13/12 Hemoglobin A1c  9.3
1020923     BLA 401.9   10/1/12 HEMOGLOBIN A1c  10
1020923     BLA 401.9   10/1/12 Hemoglobin A1c  11.8

Upvotes: 2

Views: 2971

Answers (5)

Oleksandr Fedorenko
Oleksandr Fedorenko

Reputation: 16904

In your result lab_result and encounter_date have different values. Request returns minimum values for grouping your data

SELECT pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
       pat.race, pl.icdcode, MIN(enc.encounter_date), MIN(lab.result_value)
FROM dbo.dem_patient pat RIGHT JOIN dbo.med_problemlist as pl ON pat.patient_id = pl.patient_id
                               JOIN dbo.enc_encounter as enc ON pat.patient_id = enc.patient_id
                               JOIN dbo.med_labresult as lab ON pat.patient_id = lab.patient_id
WHERE pl.icdcode like '401.9%' AND enc.encounter_date >= '01/01/2012' AND enc.encounter_date <= '12/31/2012' AND
      pat.last_name != 'test' AND (lab.test_performed = 'HgbA1c' OR lab.test_performed = 'Hemoglobin A1c')
GROUP BY pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
         pat.race, pl.icdcode
HAVING COUNT(*) > 1
ORDER BY race, ethnic_group

Upvotes: 0

HLGEM
HLGEM

Reputation: 96650

;WITH PatientData (patient_id,last_name, first_name, ethnic_group,
race, icdcode, encounter_date, test_performed, result_value,
result_units )
AS
(
SELECT pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
pat.race, pl.icdcode, enc.encounter_date, lab.test_performed, lab.result_value,
lab.result_units
FROM dbo.dem_patient pat
RIGHT JOIN dbo.med_problemlist as pl
ON pat.patient_id = pl.patient_id
JOIN dbo.enc_encounter as enc
ON pat.patient_id = enc.patient_id
JOIN dbo.med_labresult as lab
ON pat.patient_id = lab.patient_id
WHERE pl.icdcode like '401.9%' AND
enc.encounter_date >= '01/01/2012' AND enc.encounter_date <= '12/31/2012' AND
pat.last_name != 'test' AND (lab.test_performed = 'HgbA1c' OR 
lab.test_performed = 'Hemoglobin A1c')
GROUP BY pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
pat.race, pl.icdcode, enc.encounter_date, lab.test_performed, lab.result_value,
lab.result_units
)

SELECT patient_id,last_name, first_name, ethnic_group,
race, icdcode, encounter_date, test_performed, result_value,
result_units
FROM PatientData pd1 
WHERE EXISTS (SELECT patient_id 
                FROM PatientData pd2 
                WHERE p2.patient_id  = pd1.patient_id  
                GROUP BY patient_id 
                HAVING COUNT(*)>1 )
ORDER BY pd1.race, pd1.ethnic_group

What this does is create a temporary named result set called a common table expression (CTE). This has all the data in it. Then the select picks out only those records in the CTE where the patientid is present more than once.

Upvotes: 0

SunilK
SunilK

Reputation: 157

Off the top of my head, add another join condition to an inline view where you grab min encounterdate from a set where the count of encounters is > 1. The join condition would be enc.encounterdate = inlineview.encounterdate.

The example result you posted looks like its looking for first encounter, so use min, if you want last encounter, then use max.

Upvotes: 0

Simon
Simon

Reputation: 1605

You query could look like this :

SELECT pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
       pat.race, pl.icdcode, enc.encounter_date, lab.test_performed, lab.result_value,
       lab.result_units
FROM dbo.dem_patient pat
     RIGHT JOIN dbo.med_problemlist as pl
      ON pat.patient_id = pl.patient_id
     JOIN dbo.enc_encounter as enc
      ON pat.patient_id = enc.patient_id
     JOIN dbo.med_labresult as lab
      ON pat.patient_id = lab.patient_id
WHERE pl.icdcode like '401.9%' AND
       enc.encounter_date >= '01/01/2012' AND enc.encounter_date <= '12/31/2012' AND
       pat.last_name != 'test' AND (lab.test_performed = 'HgbA1c' OR 
       lab.test_performed = 'Hemoglobin A1c')
GROUP BY pat.patient_id, pat.last_name, Pat.first_name, pat.ethnic_group,
         pat.race, pl.icdcode, enc.encounter_date, lab.test_performed,lab.result_value,
          lab.result_units
HAVING COUNT(*)>1

Upvotes: 0

Techmonk
Techmonk

Reputation: 1469

remove the date from selection fields and then use

Select FIELDS where condition group by FIELDS  having count(*) > 1

Upvotes: 1

Related Questions