Reputation: 59
I'm taking the min and max of several data points, but some of them have the patient going to the same locations on different days. Is there a way to add an index or code that distinguishes them as separate groups, based on an order by statement or something similar? I want to distinguish them as separate groups (by separating that the patient left one location, went to a new one, and then returned to the original location (treating it as a new location in terms of grouping)?
Here's the code to make a dummy data table.
CREATE TABLE #StartandStop (
[hospitalname] varchar(255),
pt_id varchar(255),
[Location] varchar(255),
startdatetime datetime,
enddatetime datetime
);
INSERT INTO #StartandStop ([hospitalname],[pt_id],[Location],[startdatetime],[enddatetime])
VALUES ('ClevelandClinic', '1','A','01-01-2020 10:35', '01-02-2020 12:30'),
('ClevelandClinic', '1','B','01-02-2020 12:30', '01-03-2020 00:00'),
('ClevelandClinic', '1','B','01-03-2020 00:00', '01-04-2020 00:00'),
('ClevelandClinic', '1','B','01-04-2020 00:00', '01-04-2020 19:30'),
('ClevelandClinic', '1','C','01-04-2020 19:30', '01-05-2020 18:44'),
('ClevelandClinic', '1','B','01-05-2020 18:44', '01-05-2020 07:09'),
('ClevelandClinic', '1','D','01-05-2020 07:09', '01-05-2020 22:22'),
('ClevelandClinic', '1','D','01-05-2020 22:22', '01-06-2020 00:00'),
('ClevelandClinic', '1','D','01-06-2020 00:00', '01-07-2020 00:00'),
('ClevelandClinic', '1','D','01-07-2020 00:00', '01-08-2020 00:00'),
('ClevelandClinic', '1','C','01-08-2020 00:00', '01-08-2020 20:02');
This is the code to get results.
select hospitalname,pt_id,location,min(startdatetime) as 'Start',max(enddatetime) as 'End'
from #StartandStop
group by hospitalname,pt_id,location
These are the results I get when just taking the minimum start and maximum stop of each, without taking into account the patient's movements back and forth between units.
hospitalname pt_id location Start End
ClevelandClinic 1 A 2020-01-01 10:35:00.000 2020-01-02 12:30:00.000
ClevelandClinic 1 B 2020-01-02 12:30:00.000 2020-01-05 07:09:00.000
ClevelandClinic 1 C 2020-01-04 19:30:00.000 2020-01-08 20:02:00.000
ClevelandClinic 1 D 2020-01-05 07:09:00.000 2020-01-08 00:00:00.000
These are the results I'm trying to get:
Hospital Patient Location Start Stop
ClevelandClinic 1 A 2020-01-01 10:35:00.000 2020-01-02 12:30:00.000
ClevelandClinic 1 B 2020-01-02 12:30:00.000 2020-01-04 19:30:00.000
ClevelandClinic 1 C 2020-01-04 19:30:00.000 2020-01-05 18:44:00.000
ClevelandClinic 1 B 2020-01-05 18:44:00.000 2020-01-05 07:09:00.000
ClevelandClinic 1 D 2020-01-05 07:09:00.000 2020-01-08 00:00:00.000
ClevelandClinic 1 C 2020-01-08 00:00:00.000 2020-01-08 20:02:00.000
Please advise if there's a way to mark or distinguish first and second visits to the same location as different. Thank you!
Upvotes: 0
Views: 93
Reputation: 121
ALTER TABLE #StartandStop ADD color INT
DECLARE @color INT
SELECT @color = 0
UPDATE ss SET color = @color, @color = CASE WHEN s.Location <> s.LastLocation THEN @color + 1 ELSE @color END
FROM #StartandStop ss JOIN (SELECT hospitalname, pt_id, Location, LAG(location) OVER (ORDER BY hospitalname,pt_id,startdatetime,location) AS LastLocation, startdatetime, enddatetime
FROM #StartandStop) s ON (ss.hospitalname = s.hospitalname AND ss.pt_id = s.pt_id AND ss.location = s.location AND ss.startdatetime = s.startdatetime AND ss.enddatetime = s.enddatetime)
SELECT hospitalname AS Hospital, pt_id AS Patient, MIN(Location) AS Location, MIN(startdatetime) AS Start, MAX(enddatetime) AS Stop
FROM #StartandStop
GROUP BY hospitalname,pt_id,color
ORDER BY 1,2,4,3
Upvotes: 1