rshetye
rshetye

Reputation: 667

bigquery UDF support

I am very new to BigQuery by google

I want to parse time stamp (yyyy/mm/dd:hh:mm:ss) based on the day and the month wish to bucket days into weeks. I didn't find any BigQuery function which does this.

Hence, I was wondering if there was a way in which I can write a UDF and then access it in a BigQuery query

Upvotes: 1

Views: 1619

Answers (2)

Craig Citro
Craig Citro

Reputation: 6625

There are two questions here, so two answers:

  • BigQuery does support UDFs: docs. (It didn't when I first answered this.)

  • Even without UDFs, the date bucketing is still doable. BigQuery has one time parsing function, PARSE_UTC_USEC, which is expecting input in the form YYYY-MM-DD hh:mm:ss. You'll need to use REGEXP_REPLACE to get your date into the right format. Once you've done that, UTC_USEC_TO_WEEK will block things into weeks, and you can group by that. So tying all that together, if your table has a column called timestamp, you could get counts by week via something like

    SELECT week, COUNT(week)
    FROM (SELECT UTC_USEC_TO_WEEK(
                   PARSE_UTC_USEC(
                     REGEXP_REPLACE(
                       timestamp, 
                       r"(\d{4})/(\d{2})/(\d{2}):(\d{2}):(\d{2}):(\d{2})", 
                       r"\1-\2-\3 \4:\5:\6")), 0) AS week
          FROM mytable) 
    GROUP BY week;
    

    Note that the 0 here is the argument for which day of the week to use as the "beginning"; I've used Sunday, but for "business"-y things using 1 (i.e. Monday) would likely make more sense.

Just in case you need it, the section on timestamp functions in the docs is helpful.

Upvotes: 2

thomaspark
thomaspark

Reputation: 488

UDF support in BigQuery is now here! https://cloud.google.com/bigquery/user-defined-functions

Here is some code that will convert a string time specifier into a JavaScript Date object, and extract some properties from it; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date for information on properties available for JS dates.

QUERY (replace the nested select with your table):

SELECT day_of_week, month_date
FROM parseDate(select '2015/08/01 12:00:00' as date_string);

CODE:

function parsedate(row, emit) {
  var d = new Date(row.date_string);
  emit({day_of_week: d.getDay(),
        month_date: d.getDate()});
}

bigquery.defineFunction(
  'parseDate',                           // Name of the function exported to SQL
  ['date_string'],                    // Names of input columns
  [{'name': 'day_of_week', 'type': 'integer'},
   {'name': 'month_date', 'type': 'integer'}],
  parsedate
);

Upvotes: 1

Related Questions