Bushmaster

Reputation: 4608

How to automatically update data in Google BigQuery using Python?

My steps so far:

I'm a beginner, so when I want to update the data I delete the table in Google BigQuery and run the Python code again. Now I want the data to update automatically. Is it possible to do this using Python?

Upvotes: 0

Views: 2257

Answers (2)

Matteo Felici

Reputation: 1107

I would suggest using the BigQuery Python client library. You can install it with pip install google-cloud-bigquery. Then you can do something like:

from google.cloud import bigquery

# Connect to BigQuery
client = bigquery.Client(project=your_project_id)

# Pull data into a pandas DataFrame
df = client.query('select * from your_dataset.your_table').to_dataframe()

# Write the DataFrame to a BigQuery table
job = client.load_table_from_dataframe(df, 'your_dataset.your_table')
job.result()  # wait for the load job to finish

# If you want to overwrite an existing table
job_config = bigquery.LoadJobConfig(
    write_disposition="WRITE_TRUNCATE",
)
job = client.load_table_from_dataframe(
    df, 'your_dataset.your_existing_table', job_config=job_config
)
job.result()  # wait for the load job to finish
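
With write_disposition="WRITE_TRUNCATE" the load job replaces the contents of the target table on every run, which matches your current delete-and-reload workflow; use "WRITE_APPEND" instead if you only want to add new rows to the existing table.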

Upvotes: 1

Vibhor Gupta

Reputation: 709

One solution for loading data from a relational database into BigQuery is Apache Beam (run on the Dataflow runner or the local runner, depending on the data volume and the infrastructure available for data processing). A minimal pipeline sketch follows the links below.

  1. Beam - MYSQL Connector https://pypi.org/project/beam-mysql-connector/
  2. Apache Beam Python SDK: https://beam.apache.org/documentation/sdks/python/
  3. Dataflow runner: https://cloud.google.com/dataflow
  4. Video: https://www.youtube.com/watch?v=crKdfh63-OQ
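
For illustration, here is a minimal sketch of such a Beam pipeline. The source step uses beam.Create with a couple of hypothetical rows as a stand-in; in a real job you would replace it with a read from your database (for example the beam-mysql-connector linked above). The project, dataset, table, schema and bucket names are placeholders.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    # DirectRunner by default; pass --runner=DataflowRunner (plus project,
    # region, temp_location) to run the same pipeline on Dataflow.
    options = PipelineOptions()

    with beam.Pipeline(options=options) as p:
        (
            p
            # Stand-in source: replace with a read from your relational
            # database, e.g. the beam-mysql-connector's read transform.
            | 'CreateSampleRows' >> beam.Create([
                {'id': 1, 'name': 'alice'},
                {'id': 2, 'name': 'bob'},
            ])
            | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
                'your_project:your_dataset.your_table',
                schema='id:INTEGER,name:STRING',
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                # GCS staging area used by batch file loads
                custom_gcs_temp_location='gs://your_bucket/tmp',
            )
        )


if __name__ == '__main__':
    run()

Moving this from the local runner to Dataflow only changes the pipeline options; the pipeline code itself stays the same.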

Upvotes: 0
