Reputation: 157
My table looks like this in BigQuery
+--------------------+------------+--------+--------+
| TimeStamp | Session | Type | Result |
+--------------------+------------+--------+--------+
| 7/28/2020 16:05:02 | 1595952288 | Select | 53402 |
| 7/28/2020 16:05:16 | 1595952288 | Menu | 2 |
| 7/28/2020 16:05:25 | 1595952288 | Select | 53405 |
| 7/28/2020 16:05:36 | 1595952288 | Menu | 1 |
+--------------------+------------+--------+--------+
I would like to be able to pull the data to look like the below
+--------------------+------------+--------+--------+---------+
| TimeStamp | Session | Type | Result | Confirm |
+--------------------+------------+--------+--------+---------+
| 7/28/2020 16:05:02 | 1595952288 | Select | 53402 | 2 |
| 7/28/2020 16:05:25 | 1595952288 | Select | 53405 | 1 |
+--------------------+------------+--------+--------+---------+
Basically, what the data is is a person is within a Session in an application, they are entering a 5 digit number, then they are asked to confirm it.
I am trying to tie in the next line to the 5 digit number based on the next TimeStamp and bring it into the same line to report on it.
Here is the SQL I created, I don't think it is right as it produces duplicates. I think there is a better way and it's not very efficient. Actually, it might be just plain wrong.
Select DISTINCT
table1.Session,
table1.Result,
subtable1.Confirm
FROM
`googleBQ_table` as table1
Left Join (
Select
Result as Confirm,
Session,
MAX(TimeStamp)
FROM
`googleBQ_table`
WHERE Type = 'Menu' and LENGTH(Result) < 2
group by 1,2) as sub_table1 on sub_table1.Session = table1.Session
WHERE
table1.Session = '1595952288' and LENGTH(table1.Result) = 5
Thank you in advance.
Upvotes: 1
Views: 30
Reputation: 172993
Below is for BigQuery Standard SQL
within a Session in an application, they are entering a 5 digit number, then they are asked to confirm it.
based on above - I assume that confirmation row is always after select row - so below does it
#standardSQL
SELECT * FROM (
SELECT *, LEAD(result) OVER(confirmation) Confirm
FROM `project.dataset.table`
WINDOW confirmation AS (PARTITION BY session ORDER BY TimeStamp)
)
WHERE type = 'Select'
if applied to sample data from your question - result is
Row TimeStamp Session Type Result Confirm
1 7/28/2020 16:05:02 1595952288 Select 53402 2
2 7/28/2020 16:05:25 1595952288 Select 53405 1
Upvotes: 1
Reputation: 222472
Assuming that "select" and "menu" rows properly interleave, you can use window functions:
select timestamp, session, type, result, lead_result confirm
from (
select
t.*,
lead(case when type = 'Menu' then result end)
over(partition by session order by timestamp) lead_result
from `googleBQ_table` t
) t
where type = 'Select'
Upvotes: 1