sag

Reputation: 5451

Where is the data stored by BigQuery?

I am using BigQueryIO to publish data into BigQuery from a Google Dataflow job.

AFAIK, BigQuery can be used to query data from Google Cloud Storage, Google Drive and Google Sheets.

But when we store data using BigQueryIO, where will the data be stored? Is it in Google Cloud Storage?

Upvotes: 1

Views: 4750

Answers (3)

Pentium10

Reputation: 207912

BigQuery is a managed data warehouse; put simply, it's a database.

So your data will be stored in BigQuery, and you can access it using SQL queries.

Upvotes: 1

Mikhail Berlyant

Reputation: 172993

Short answer: BigQueryIO writes to and reads from BigQuery tables.
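For illustration, here is a minimal sketch of such a write using the Apache Beam Python SDK (the question may well be using the Java SDK, where the equivalent is BigQueryIO). The project, dataset, table, and schema below are hypothetical placeholders, and actually running it needs the `apache-beam[gcp]` package, GCP credentials, and suitable pipeline options, none of which are shown.

```python
def table_spec(project: str, dataset: str, table: str) -> str:
    """Build the 'project:dataset.table' string Beam accepts as a table reference."""
    return f"{project}:{dataset}.{table}"


def run(rows, project="my-project", dataset="my_dataset", table="my_table"):
    """Write an iterable of dict rows into a BigQuery table.

    All default names are hypothetical; replace them with your own.
    """
    # Imported lazily so table_spec() stays usable without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "CreateRows" >> beam.Create(rows)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table_spec(project, dataset, table),
                # Assumed schema for the example rows.
                schema="name:STRING,score:INTEGER",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

The destination is a BigQuery table reference; no Cloud Storage bucket is named as the place where the rows finally live.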

To go a little deeper:
BigQuery stores data in the Capacitor columnar data format, and offers the standard database concepts of tables, partitions, columns, and rows.

It manages the technical aspects of storing your structured data, including compression, encryption, replication, performance tuning, and scaling.
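To give a rough feel for what "columnar" means, here is a toy sketch that pivots row records into per-column lists. It is only an illustration of the idea, not Capacitor's actual encoding, and the sample field names are made up. It assumes every row has the same keys.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict mapping column name to a list of values."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


# Hypothetical sample rows.
rows = [
    {"name": "alice", "score": 3},
    {"name": "bob", "score": 5},
]

columns = to_columns(rows)

# An aggregate over one column only touches that column's values.
total_score = sum(columns["score"])
```

With data laid out by column, a query that reads only `score` does not need to scan `name` at all.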

You can read more about BigQuery's different components in the BigQuery Overview.

Upvotes: 6

Paul

Reputation: 27423

Cloud Storage is a separate service from BigQuery. Internally, BigQuery manages its own storage.

So, if you save your data to Cloud Storage, and then use the bq command to load a BigQuery table from a file in Cloud Storage, there are now 2 copies of the data.

Consequences include:

  • If you delete the Cloud Storage copy, the data will still be in BigQuery.
  • You pay storage fees for each copy. As of April 2017, I think long-term storage in BigQuery is around $0.01/GB, and Cloud Storage is around $0.01-$0.026/GB depending on the storage class.
  • If the same data is in both Cloud Storage and BigQuery, you are paying twice. Whether it is worthwhile to keep a backup copy of the data is up to you.
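To put rough numbers on the "paying twice" point, here is a small sketch using the approximate April 2017 figures above. It assumes those rates are per GB per month and ignores free tiers, active-storage rates, and any other charges, so treat the result as a ballpark only.

```python
# Approximate April 2017 rates from the list above (assumed to be per GB per month).
BQ_LONG_TERM_PER_GB = 0.01
GCS_LOW_PER_GB = 0.01
GCS_HIGH_PER_GB = 0.026


def two_copy_cost_range(size_gb):
    """Return (low, high) monthly cost of keeping the same data in both services."""
    bq_cost = size_gb * BQ_LONG_TERM_PER_GB
    low = bq_cost + size_gb * GCS_LOW_PER_GB
    high = bq_cost + size_gb * GCS_HIGH_PER_GB
    return low, high


# Example: 1000 GB stored in both BigQuery and Cloud Storage.
low, high = two_copy_cost_range(1000)
print(f"Estimated monthly cost for both copies: ${low:.2f} to ${high:.2f}")
```

Dropping the Cloud Storage copy after the load would leave only the BigQuery part of that estimate.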

Upvotes: 5
