Reputation: 3
I have a table that has a column for timestamps and another column that has statuses. I want to grab the timestamp when the status is checked-in as well as the timestamp when it is completed into one row. When I try to use a case statement I end up with it split into two rows. I am wanting it to return one row with the values in each timestamp column rather than two rows one for each with the null in the other.
CASE WHEN aud.STATUS_DESCRIPTION = 'CHECKED_IN' THEN aud.STATUS_DATETIME
END AS "Check-In Time",
CASE WHEN aud.STATUS_DESCRIPTION = 'COMPLETED' THEN aud.STATUS_DATETIME
END AS "Completed Time",
Table with statuses and timestamps
What my case statement is returning
Thank you for any help
Upvotes: 0
Views: 310
Reputation: 25968
This is happening because your data is across many rows.
You ether need to do some form of aggregation, so GROUP BY and then using an aggregate function like MIN/MAX
OR You need to classify the data you want, and then use a PIVOT to do the aggregation for you.
The first might look like:
SELECT
some_column_a,
some_column_b,
MAX(IFF( aud.status_description = 'CHECKED_IN', aud.status_datetime, null)) as check_in_time
MAX(IFF( aud.status_description = 'COMPLETED', aud.status_datetime, null)) as complete_time
FROM table
GROUP BY some_column_a, some_column_b
ORDER BY some_column_a, some_column_b;
So adding a working example
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
)
SELECT
aud.customer_id,
MAX(IFF( aud.status_description = 'CHECKED_IN', aud.status_datetime, null)) as check_in_time,
MAX(IFF( aud.status_description = 'COMPLETED', aud.status_datetime, null)) as complete_time
FROM data as aud
GROUP BY 1
ORDER BY 1;
This example works well if you have many customer_id
's and many entries per customer_id. If how every your table size is small, and you never have two records in the "completed" state then the join can work.
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
)
SELECT
checked.customer_id,
checked.status_datetime as check_in_time,
completed.status_datetime as complete_time
FROM data as checked
JOIN data as completed
ON checked.customer_id = completed.customer_id
AND checked.STATUS_DESCRIPTION = 'CHECKED_IN'
AND completed.STATUS_DESCRIPTION = 'COMPLETED'
;
The place the join does not work is if you do not have both "completed" and "checked_in". For the above SQL there is no row for customer_id 0
. Because there is only one
So for that you need a full outer join, and then it makes sense to move the filters to a CTE (or sub select), like so:
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
), completed_data AS (
SELECT STATUS_DATETIME, STATUS_DESCRIPTION, customer_id
FROM data
WHERE STATUS_DESCRIPTION = 'COMPLETED'
), checked_in_data AS (
SELECT STATUS_DATETIME, STATUS_DESCRIPTION, customer_id
FROM data
WHERE STATUS_DESCRIPTION = 'CHECKED_IN'
)
SELECT
COALESCE(checked.customer_id, completed.customer_id) AS customer_id,
checked.status_datetime as check_in_time,
completed.status_datetime as complete_time
FROM checked_in_data as checked
FULL OUTER JOIN completed_data as completed
ON checked.customer_id = completed.customer_id
ORDER BY 1,2;
;
which gives the output:
CUSTOMER_ID | CHECK_IN_TIME | COMPLETE_TIME |
---|---|---|
0 | 2021-10-11 | |
1 | 2021-12-11 | 2021-12-11 |
Upvotes: 1
Reputation: 751
This is just an example of how to use pivot
which is an addition to Simeon's answer.Using sample data from the image provided.
Table creation and data insertion:
create or replace temporary table _temp (
ts timestamp_ntz,
_status varchar
);
insert into _temp
values ('2021-12-11 11:12:03','created'),
('2021-12-11 11:12:03','checked_in'),
('2021-12-11 11:22:49','progress'),
('2021-12-11 11:55:03','completed');
Pivot query:
select *
from _temp
pivot(max(ts) for _status in ('checked_in', 'completed')) as p;
Result:
'checked_in' 'completed'
2021-12-11 11:12:03.000 2021-12-11 11:55:03.000
Note that I've used MAX
aggregate function which can be replaced by other aggregate functions. This would always return a single row if there are only 2 columns, to get a better sense of pivot have another column and take a look at examples provided in Pivot's doc.
Upvotes: 1
Reputation: 846
I'd start with self-join.
SELECT
chcecked.STATUS_DATETIME as CHECKED_IN_TIME,
completed.STATUS_DATETIME as COMPLETED_TIME
FROM
yourtable as checked
JOIN
yourtable as completed
ON ....
Upvotes: 1