Take2
Take2

Reputation: 159

Select IDs for only three event types

I have an events table. It contains several types of events for several subject_ids and hadm_ids. For example, it contains Heart Rhythm events under the 'eventlabel' HRy.

I want to select subject_ids and hadm_ids which contain, for the eventlabel HRy, either value 'SR' or 'SB' or 'ST'. Subjects with other values for eventlabel HRy are discardaded. This means that if a patient has eventlabel HRy other than 'SR', 'SB' or 'ST', they are discarded. If they have all 'SR' or all 'SB' or all 'ST', it's all good. If they have a mixture of those 3, it's all good. There are other types of events (for example BP), but they don't matter.

Here is an example table with expected output:

drop table testevents cascade;
create table testevents(
hadm_id int not null,
subject_id int not null,
eventtype int not null,
eventlabel char(30) not null,
value char(360) not null,
valuenum int
);

insert into testevents(hadm_id, subject_id, eventtype, eventlabel, value, valuenum)
values
    (1, 1, 220048, 'HRy', 'SR', null),
    (1, 1, 220048, 'HRy', 'SR', null),
    (1, 1, 220048, 'HRy', 'SR', null),
    (1, 1, 220048, 'HRy', 'SR', null),
    (1, 1, 220048, 'HRy', 'SR', null),
    (1, 1, 220048, 'HRy', 'SR', null),   --all good here: SR all the time

    (2, 2, 220048, 'HRy', 'SR', null),
    (2, 2, 220048, 'HRy', 'SR', null),
    (2, 2, 220048, 'HRy', 'SR', null),
    (2, 2, 220048, 'HRy', 'ST', null),
    (2, 2, 220048, 'HRy', 'SR', null),
    (2, 2, 220048, 'HRy', 'ST', null),  --all good here: either SR or ST, both allowed

    (3, 3, 220048, 'HRy', 'ST', null),
    (3, 3, 220048, 'HRy', 'ST', null),
    (3, 3, 220048, 'HRy', 'ST', null),
    (3, 3, 220048, 'HRy', 'ST', null),
    (3, 3, 220048, 'HRy', 'ST', null),
    (3, 3, 220048, 'HRy', 'ST', null),   --all good here: ST all the time
    (3, 3, 4053, 'BP', '87', 87),        --it contains another type of event, which doesn't matter

    (4, 4, 220048, 'HRy', 'ST', null),
    (4, 4, 220048, 'HRy', 'ST', null),
    (4, 4, 220048, 'HRy', 'AF', null),  --Here we have AF, which is not allowed. 
    (4, 4, 220048, 'HRy', 'ST', null),
    (4, 4, 220048, 'HRy', 'ST', null),
    (4, 4, 220048, 'HRy', 'SR', null),   
    (4, 4, 4053, 'BP', '87', 87),        

    (5, 5, 220048, 'HRy', 'SB', null),
    (5, 5, 220048, 'HRy', 'ST', null),
    (5, 5, 220048, 'HRy', 'SR', null),  --Here we have the 3 different types, all alowed. 
    (5, 5, 220048, 'HRy', 'SB', null),
    (5, 5, 220048, 'HRy', 'SR', null),
    (5, 5, 220048, 'HRy', 'SR', null),   
    (5, 5, 4053, 'BP', '87', 87),        

    (6, 6, 220048, 'HRy', 'SR', null), -- allowed
    (6, 6, 211, 'HRa2', '134', 134), -- doesn't matter
    (6, 6, 211, 'HRa2', '187', 187), -- doesn't matter
    (6, 6, 220048, 'HRy', 'AF', null), -- NOT allowed
    (6, 6, 220048, 'HRy', 'SR', null) -- allowed
;


output:
hadm_id, subject_id
1            1
2            2
3            3
5            5

How can I achieve this?

Many thanks!

Upvotes: 1

Views: 48

Answers (1)

404
404

Reputation: 8542

Here's one way to do it:

SELECT hadm_id, subject_id
FROM testevents
WHERE eventlabel = 'HRy'
GROUP BY hadm_id, subject_id
HAVING ARRAY_AGG(DISTINCT TRIM(value)) <@ ARRAY['SR', 'SB', 'ST']
ORDER BY hadm_id, subject_id

Returns:

enter image description here

Get all the distinct values per hadm_id and subject_id, and check that they're all contained within an array of the allowed values. The TRIM btw is because of the char(360) type, which pads the value with spaces.

Upvotes: 1

Related Questions