Reputation: 11
I can create BigQuery ML models from the BigQuery web UI, but I'm trying to keep all of my code in Python notebooks. Is there any way to create the models from the notebook without having to jump out to the web UI? I am already able to use the predict function to generate model results from a Jupyter notebook.
Thanks.
Upvotes: 0
Views: 585
Reputation: 208042
You don't need to do anything special; just run the CREATE MODEL statement as a standalone query.
Create your dataset
Enter the following code to import the BigQuery Python client library and initialize a client. The BigQuery client is used to send and receive messages from the BigQuery API.
from google.cloud import bigquery
client = bigquery.Client(location="US")
Next, you create a BigQuery dataset to store your ML model. Run the following to create your dataset:
dataset = client.create_dataset("bqml_tutorial")
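If the notebook is re-run, the create_dataset call above fails with a Conflict error because the dataset already exists. A minimal sketch of an idempotent version, assuming a google-cloud-bigquery release in which create_dataset accepts the exists_ok argument; the helper name ensure_dataset is hypothetical and not part of the library:

```python
def ensure_dataset(client, dataset_id="bqml_tutorial"):
    """Create the dataset if it is missing and return it.

    `client` is assumed to be a google.cloud.bigquery.Client. With
    exists_ok=True, an already existing dataset is returned instead of
    raising a Conflict error, so the cell can be re-run safely.
    """
    return client.create_dataset(dataset_id, exists_ok=True)


# Usage in a notebook cell (needs google-cloud-bigquery and credentials):
#   from google.cloud import bigquery
#   client = bigquery.Client(location="US")
#   dataset = ensure_dataset(client)
```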
Create your model
Next, you create a logistic regression model using the Google Analytics sample dataset for BigQuery. The model is used to predict whether a website visitor will make a transaction. The standard SQL query uses a CREATE MODEL statement to create and train the model. Standard SQL is the default query syntax for the BigQuery Python client library.
The BigQuery Python client library provides a cell magic, %%bigquery, which runs a SQL query and returns the results as a pandas DataFrame. Load the magic once per notebook session with %load_ext google.cloud.bigquery.
Run the following CREATE MODEL query to create and train your model:
%%bigquery
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
The query takes several minutes to complete. After the first training iteration is complete, your model (sample_model) appears in the navigation panel of the BigQuery web UI. Because the query uses a CREATE MODEL statement to create a model rather than return rows, you do not see query results; the output is an empty DataFrame.
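The same statement can also be sent from plain Python instead of the cell magic, which is what running it "as a standalone query" amounts to. A sketch, assuming `client` is the bigquery.Client created earlier; the helper names build_create_model_sql and run_create_model are hypothetical, while client.query() and the job's result() method are the client library's calls for starting a query job and waiting for it to finish:

```python
# CREATE MODEL statement from the answer, with the model name as a placeholder.
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `{model_id}`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
"""


def build_create_model_sql(model_id="bqml_tutorial.sample_model"):
    """Return the CREATE MODEL statement for the given dataset.model name."""
    return CREATE_MODEL_SQL.format(model_id=model_id)


def run_create_model(client, model_id="bqml_tutorial.sample_model"):
    """Start the training query and block until it finishes.

    `client` is assumed to be a google.cloud.bigquery.Client.
    """
    job = client.query(build_create_model_sql(model_id))  # submit the query job
    job.result()  # wait for completion; a CREATE MODEL job returns no rows
    return job


# Usage in a notebook cell (needs google-cloud-bigquery and credentials):
#   from google.cloud import bigquery
#   client = bigquery.Client(location="US")
#   run_create_model(client)
```

Keeping the SQL in a Python string makes it easy to parameterize the model name, at the cost of losing the SQL syntax highlighting that the %%bigquery cell provides.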
Upvotes: 1