Reputation: 45
I want to load data from MySQL to BigQuery using Cloud Dataflow. Can anyone share an article or work experience about loading data from MySQL to BigQuery with Cloud Dataflow, using Python?
Thank you
Upvotes: 1
Views: 1363
Reputation: 6572
If you only want to copy data from MySQL to BigQuery, you can first export your MySQL data to Cloud Storage, then load that file into a BigQuery table.
I think there is no need to use Dataflow in this case, because you don't have complex transformations or business logic. It's only a copy.
Export the MySQL data to Cloud Storage via a SQL query and the gcloud CLI:
gcloud sql export csv INSTANCE_NAME gs://BUCKET_NAME/FILE_NAME \
  --database=DATABASE_NAME \
  --offload \
  --query=SELECT_QUERY \
  --quote="22" \
  --escape="5C" \
  --fields-terminated-by="2C" \
  --lines-terminated-by="0A"
The character flags take hex ASCII codes: 22 is a double quote, 5C a backslash, 2C a comma, and 0A a newline.
Load the CSV file into a BigQuery table via the bq CLI:
bq load \
  --source_format=CSV \
  mydataset.mytable \
  gs://mybucket/mydata.csv \
  ./myschema.json
./myschema.json is the BigQuery table schema.
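For reference, a BigQuery schema file is a JSON array of field definitions. A minimal sketch, with hypothetical column names that you would replace with your own:

```json
[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "name", "type": "STRING", "mode": "NULLABLE"},
  {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"}
]
```

The column order should match the column order of the exported CSV.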
Upvotes: 0
Reputation: 1428
You can use apache_beam.io.jdbc to read from your MySQL database, and the BigQuery I/O to write to BigQuery.
Beam knowledge is expected, so I recommend looking at the Apache Beam Programming Guide first.
If you are looking for something pre-built, we have the JDBC to BigQuery Google-provided template, which is open-source (here), but it is written in Java.
Upvotes: 2