Reputation: 23
I have a table with around 1000 time series, one per column. I need to fit an ARIMA model in GCP BigQuery for each one. How can I do this without creating a separate ARIMA model for each series, get a prediction for the next 3 periods, and append it to the series table?
Thank you
Upvotes: 0
Views: 249
Reputation: 208012
You need to define additional options in the CREATE MODEL statement.
Here is an example:
#standardSQL
CREATE OR REPLACE MODEL bqml_tutorial.nyc_citibike_arima_model_group
OPTIONS
  (model_type = 'ARIMA_PLUS',
   time_series_timestamp_col = 'date',
   time_series_data_col = 'num_trips',
   time_series_id_col = 'start_station_name',
   auto_arima_max_order = 5
  ) AS
SELECT
  start_station_name,
  EXTRACT(DATE FROM starttime) AS date,
  COUNT(*) AS num_trips
FROM
  `bigquery-public-data`.new_york.citibike_trips
WHERE start_station_name LIKE '%Central Park%'
GROUP BY start_station_name, date
The OPTIONS(model_type='ARIMA_PLUS', time_series_timestamp_col='date', ...) clause indicates that you are creating a set of ARIMA_PLUS time-series models. In addition to time_series_timestamp_col and time_series_data_col, you must specify time_series_id_col, which identifies the different input time series, so a single CREATE MODEL statement fits one model per distinct value of that column.
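To get the next 3 periods for every series, something along these lines should work. It is a minimal sketch that assumes the model from the example above has already been trained; the horizon of 3 comes from your question, and the 0.9 confidence level is just an illustrative value.

```sql
#standardSQL
-- Forecast the next 3 periods for every series in the grouped model.
-- ML.FORECAST returns one row per series and forecast step; the series id
-- column keeps the name given in time_series_id_col.
SELECT
  start_station_name,
  DATE(forecast_timestamp) AS date,   -- forecast_timestamp is a TIMESTAMP
  forecast_value AS num_trips
FROM
  ML.FORECAST(MODEL bqml_tutorial.nyc_citibike_arima_model_group,
              STRUCT(3 AS horizon, 0.9 AS confidence_level))
ORDER BY start_station_name, date
```

The result has the same shape as the training query (id, date, value), so it can be appended to the historical rows with UNION ALL or an INSERT. Since your series are stored one per column, you would first need to reshape the table into this long format (one row per series and timestamp), for example with the UNPIVOT operator, so that the former column name can serve as time_series_id_col.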
There is a full tutorial about this here (step 4 is your use case):
Upvotes: 1