APB

Reputation: 45

Load data from MySQL to BigQuery using Dataflow

I want to load data from MySQL into BigQuery using Cloud Dataflow. Can anyone share an article or their experience with loading data from MySQL into BigQuery using Cloud Dataflow with Python?

Thank you

Upvotes: 1

Views: 1363

Answers (2)

Mazlum Tosun

Reputation: 6572

If you only want to copy data from MySQL to BigQuery, you can first export your MySQL data to Cloud Storage, then load that file into a BigQuery table.

I think there is no need for Dataflow in this case, because you have no complex transformations or business logic; it is just a copy.

Export the MySQL data to Cloud Storage with a SQL query and the gcloud CLI:

gcloud sql export csv INSTANCE_NAME gs://BUCKET_NAME/FILE_NAME \
--database=DATABASE_NAME \
--offload \
--query=SELECT_QUERY \
--quote="22" \
--escape="5C" \
--fields-terminated-by="2C" \
--lines-terminated-by="0A"
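The --quote, --escape, --fields-terminated-by and --lines-terminated-by flags take ASCII character codes in hex: 22 is the double quote, 5C the backslash, 2C the comma and 0A the newline. A quick sanity check in Python:

```python
# Decode the hex ASCII codes used by `gcloud sql export csv`.
codes = {"quote": "22", "escape": "5C", "fields": "2C", "lines": "0A"}
decoded = {name: chr(int(code, 16)) for name, code in codes.items()}
print(decoded)  # {'quote': '"', 'escape': '\\', 'fields': ',', 'lines': '\n'}
```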

Load the CSV file into a BigQuery table with the bq CLI:

bq load \
  --source_format=CSV \
  mydataset.mytable \
  gs://mybucket/mydata.csv \
  ./myschema.json

./myschema.json contains the BigQuery table schema.
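For reference, a schema file for a hypothetical two-column table could look like this (the column names here are only illustrative, not from the question):

```json
[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "name", "type": "STRING", "mode": "NULLABLE"}
]
```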

Upvotes: 0

Bruno Volpato

Reputation: 1428

You can use apache_beam.io.jdbc to read from your MySQL database, and the BigQuery I/O to write to BigQuery.

Beam knowledge is expected, so I recommend looking at the Apache Beam Programming Guide first.

If you are looking for something pre-built, we have the JDBC to BigQuery Google-provided template, which is open-source (here), but it is written in Java.

Upvotes: 2
