Reputation: 85
Good day, all. I wrote a question relating to this earlier, but now I have encountered another problem.
I have to calculate the timestamp difference between the install_time and contributer_time columns. HOWEVER, I have three contributor_time columns, and I need to select the latest time from those columns first then subtract it from install time.
Sample Data
users | install_time | contributor_time_1 | contributor_time_2 | contributor_time_3 |
---|---|---|---|---|
1 | 8:00 | 7:45 | 7:50 | 7:55 |
2 | 10:00 | 9:15 | 9:45 | 9:30 |
3 | 11:00 | 10:30 | null | null |
For example, in the table above I would need to select contributor_time_3 and subtract it from install_time for user 1. For user 2, I would do the same, but with contributor_time_2.
Sample Results
users | install_time | time_diff_min |
---|---|---|
1 | 8:00 | 5 |
2 | 10:00 | 15 |
3 | 11:00 | 30 |
The problem I am facing is that 1) the contributor_time columns are in string format and 2) some of them have 'null' string values (which means that I cannot cast it into a timestamp.)
I created a query, but I am am facing an error stating that I cannot subtract a string from timestamp. So I added safe_cast, however the time_diff_min results are only showing when I have all three contributor_time columns as a timestamp. For example, in the sample table above, only the first two rows will pull.
The query I have so far is below:
SELECT
users,
install_time,
TIMESTAMP_DIFF(install_time, greatest(contributor_time_1, contributor_time_2, contributor_time_3), MINUTE) as ctct_min
FROM
(SELECT
users,
install_time,
safe_cast(contributor_time_1 as timestamp) as contributor_time_1,
safe_cast(contributor_time_2 as timestamp) as contributor_time_2,
safe_cast(contributor_time_3 as timestamp) as contributor_time_3,
FROM
(SELECT
users,
install_time,
case when contributor_time_1 = 'null' then '0' else contributor_time_1 end as contributor_time_1,
....
FROM datasource
Any help to point me in the right direction is appreciated! Thank you in advance!
Upvotes: 0
Views: 1992
Reputation: 173190
Consider below
select users, install_time,
time_diff(
parse_time('%H:%M',install_time),
greatest(
parse_time('%H:%M',contributor_time_1),
parse_time('%H:%M',contributor_time_2),
parse_time('%H:%M',contributor_time_3)
),
minute) as time_diff_min
from `project.dataset.table`
if applied to sample data in your question - output is
Above can be refactored slightly into below
create temp function latest_time(arr any type) as ((
select parse_time('%H:%M',val) time
from unnest(arr) val
order by time desc
limit 1
));
select users, install_time,
time_diff(
parse_time('%H:%M',install_time),
latest_time([contributor_time_1, contributor_time_2, contributor_time_3]),
minute) as time_diff_min
from `project.dataset.table`
less verbose and no redundant parsing - with same result - so just matter of preferences
Upvotes: 2
Reputation: 1271151
You can use greatest()
:
select t.*,
time_diff(install_time, greatest(contributor_time_1, contributor_time_2, contributor_time_3), minute) as diff_min
from t;
Note: this assumes that the values are never NULL
, which seems reasonable based on your sample data.
Upvotes: 1