bigdanbenjamin

Reputation: 11

Creating a BigQuery ML Model From a Jupyter Notebook

I can create BigQuery ML models from the Google BigQuery web UI, but I'm trying to keep all of my code in Python notebooks. Is there any way to create the models from the notebook without jumping out to the web UI? I can already use the predict function to generate model results from the Jupyter notebook.

Thanks.

Upvotes: 0

Views: 585

Answers (1)

Pentium10

Reputation: 208042

You don't need to do anything special: CREATE MODEL is an ordinary SQL statement, so you can run it from the notebook as a standalone query.

Create your dataset

Enter the following code to import the BigQuery Python client library and initialize a client. The BigQuery client is used to send and receive messages from the BigQuery API.

from google.cloud import bigquery

client = bigquery.Client(location="US")

Next, you create a BigQuery dataset to store your ML model. Run the following to create your dataset:

dataset = client.create_dataset("bqml_tutorial")
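Re-running that cell after the dataset exists raises a Conflict error. A minimal sketch of a re-runnable variant, assuming the `client` created above is passed in; the helper name `ensure_dataset` is hypothetical and not part of the library, while `exists_ok` is a parameter of `Client.create_dataset`:

```python
def ensure_dataset(client, dataset_id="bqml_tutorial"):
    """Create the dataset if it is missing and return it.

    `client` is assumed to be a google.cloud.bigquery.Client;
    `ensure_dataset` itself is an illustrative helper name.
    """
    # exists_ok=True returns the existing dataset instead of raising
    # a Conflict error when the dataset has already been created.
    return client.create_dataset(dataset_id, exists_ok=True)


# Usage in the notebook (needs credentials, so it is left commented out):
# dataset = ensure_dataset(client)
```

This keeps the notebook safe to run top to bottom more than once.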

Create your model

Next, you create a logistic regression model using the Google Analytics sample dataset for BigQuery. The model is used to predict whether a website visitor will make a transaction. The standard SQL query uses a CREATE MODEL statement to create and train the model. Standard SQL is the default query syntax for the BigQuery Python client library.

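The BigQuery Python client library provides a cell magic, %%bigquery, which runs a SQL query and returns the results as a pandas DataFrame. If the magic is not already registered in your notebook, load it first with %load_ext google.cloud.bigquery.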

To run the CREATE MODEL query to create and train your model:

%%bigquery
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'

The query takes several minutes to complete. After the first iteration is complete, your model (sample_model) appears in the navigation panel of the BigQuery web UI. Because the query uses a CREATE MODEL statement to create a model rather than return rows, you do not see query results. The output is an empty DataFrame.
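The cell magic is only a convenience. The same statement can also be sent from plain Python through the client, which is handy outside Jupyter or when the SQL is built programmatically. A minimal sketch, assuming a `bigquery.Client` is passed in; the names `CREATE_MODEL_SQL` and `create_model` are illustrative, while `Client.query`, `QueryJob.result`, and `Client.get_model` are library calls:

```python
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
"""


def create_model(client, sql=CREATE_MODEL_SQL,
                 model_id="bqml_tutorial.sample_model"):
    """Run a CREATE MODEL statement and return the model's metadata.

    `client` is assumed to be a google.cloud.bigquery.Client.
    """
    job = client.query(sql)  # submit the statement as a query job
    job.result()             # block until training finishes; raises on failure
    # Fetch the model afterwards to confirm it exists in the dataset.
    return client.get_model(model_id)


# Usage in the notebook (needs credentials, so it is left commented out):
# model = create_model(client)
# print(model.model_type)
```

Waiting on `job.result()` matters here because training runs for several minutes; without it the cell returns before the model is ready.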

Upvotes: 1
