akshah123
akshah123

Reputation: 778

BigQuery converting to a different timezone

I am storing data in unixtimestamp on google big query. However, when the user will ask for a report, she will need the filtering and grouping of data by her local timezone.

The data is stored in GMT. The user may wish to see the data in EST. The report may ask the data to be grouped by date.

I don't see the timezone conversion function here:

Does anyone know how I can do this in bigquery? i.e. how do i group by after converting the timestamp to a different timezone?

Upvotes: 48

Views: 236527

Answers (7)

Philippe Hebert
Philippe Hebert

Reputation: 2028

To those that stumble here:

How to convert a timestamp to another timezone?

Given that TIMESTAMP values, once constructed, are stored as UTC, and that TIMESTAMP does not have a constructor (TIMESTAMP, STRING), you can convert a timestamp to another time zone by transforming it first to a DATETIME and then constructing the new TIMESTAMP from the DATETIME in the new timezone:

SELECT TIMESTAMP(DATETIME(timestamp_field, '{timezone}'))

Example:

SELECT
    input_tz,
    input,
    'America/Montreal' AS output_tz,
    TIMESTAMP(DATETIME(input,'America/Montreal')) AS output
FROM (
    SELECT 'US/Pacific' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'US/Pacific') AS input
    UNION ALL
    SELECT 'UTC' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'UTC') AS input
    UNION ALL
    SELECT 'Europe/Berlin' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'Europe/Berlin') AS input
) t

results in:

Row input_tz input output_tz output
1 US/Pacific 2021-01-02 00:00:00 UTC America/Montreal 2021-01-01 19:00:00 UTC
2 UTC 2021-01-01 16:00:00 UTC America/Montreal 2021-01-01 11:00:00 UTC
3 Europe/Berlin 2021-01-01 15:00:00 UTC America/Montreal 2021-01-0110:00:00 UTC

How to strip time zone info from a DATETIME value?

DATETIME in BigQuery are time zone naive, such that they do not contain timezone info. This being said, if you have business knowledge that allows you to know the timezone of a DATETIME, you can strip that timezone offset by converting it to a TIMESTAMP with the known timezone:

SELECT TIMESTAMP(datetime_value, '{timezone}')

Given that the TIMESTAMP stores the value in UTC, you can then re-convert to DATETIME if that's your preferred method of storage, but now you'll know that your DATETIME is in UTC :)

Hopefully this can be helpful! :)

Upvotes: 11

zaan
zaan

Reputation: 897

For me TIMESTAMP_SUB and TIMESTAMP_ADD functions did the job. When needed to convert timestamp from UTC to PST I used:

TIMESTAMP_SUB(`timestamp`, INTERVAL 8 HOUR)

Upvotes: -2

Nishant
Nishant

Reputation: 425

To convert any TimeZone DateTime string to UTC, one could use PARSE_TIMESTAMP using the supported TIMESTAMP Formats in BigQuery.

For example to convert IST (Indian Standard Time) string to UTC, use the following:

SAFE.PARSE_TIMESTAMP("%a %b %d %T IST %Y", timeStamp_vendor, "Asia/Kolkata")

Here PARSE_TIMESTAMP parses the IST string to a UTC TIMESTAMP (not string). Adding SAFE as prefix takes care of errors/nulls etc.

To convert this to a readable string format in BigQuery, use FORMAT_TIMESTAMP as follows:

FORMAT_TIMESTAMP("%d-%b-%Y %T %Z", SAFE.PARSE_TIMESTAMP("%a %b %d %T IST %Y", timeStamp_vendor, "Asia/Kolkata"))

This example would take an IST string of the format Fri May 12 09:45:12 IST 2019 and convert it to 12-May-2019 04:15:12 UTC.

Replace IST with the required TimeZone and Asia/Kolkata with relevant Timezone name to achieve the conversion for your timezone

Upvotes: 4

Michael Manoochehri
Michael Manoochehri

Reputation: 7877

2016 update: Look answers below, BigQuery now provides timestamp and timezone methods.


You are right - BigQuery doesn't provide any timestamp conversion methods.

In this case, I suggest that you run your GROUP BY based on dimensions of the GMT/UTC timestamp field, and then convert and display the result in the local timezone in your code.

Upvotes: 3

Mani Gandham
Mani Gandham

Reputation: 8272

Standard SQL in BigQuery has built-in functions:

DATE(timestamp_expression, timezone)
TIME(timestamp, timezone)
DATETIME(timestamp_expression, timezone)

Example:

SELECT 
   original,
   DATETIME(original, "America/Los_Angeles") as adjusted
FROM sometable;

+---------------------+---------------------+
| original            | adjusted            |
+---------------------+---------------------+
| 2008-12-25 05:30:00 | 2008-12-24 21:30:00 |
+---------------------+---------------------+

You can use standard IANA timezone names or offsets.

Upvotes: 68

milyord
milyord

Reputation: 1029

As of September 2016 BigQuery has adopted standard SQL and you can now just use the "DATE(timestamp, timezone)" function to offset for a timezone. You can reference their docs here:

BigQuery DATE docs

Upvotes: 36

mdahlman
mdahlman

Reputation: 9390

Your premise is right. If you group like this, then users who want EST or EDT will get incorrect date grouping:

GROUP BY UTC_USEC_TO_DAY(ts_field)

But as long as you figure out the offset that your user wants, you can still do the full calculation on the server. For example, if EST is 5 hours behind UTC then query like this:

GROUP BY UTC_USEC_TO_DAY(ts_field - (5*60*60*1000*1000000) )

Just parameterize the "5" to be the offset in hours, and you're all set. Here's a sample based on one of the sample data sets:

SELECT
  COUNT(*) as the_count,
  UTC_USEC_TO_DAY(timestamp * 1000000 - (5*60*60*1000*1000000) ) as the_day
FROM
  [publicdata:samples.wikipedia]
WHERE
  comment CONTAINS 'disaster'
  and timestamp >= 1104537600
GROUP BY
  the_day
ORDER BY
  the_day

You can remove the offset to see how some edits move to different days.

Upvotes: 4

Related Questions