manish patel

Reputation: 1

Is it possible to write/run BigQuery queries on Parquet files stored on AWS S3?

We want to check BigQuery performance on externally stored Parquet files. These Parquet files are stored on AWS S3. Without transferring the files to GCP, is it possible to write a BigQuery query that can run against a dataset of Parquet files stored on AWS S3?

Upvotes: 0

Views: 815

Answers (2)

Joaquim

Reputation: 406

You can use the BigQuery Data Transfer Service for Amazon S3, which allows you to automatically schedule and manage recurring load jobs from Amazon S3 into BigQuery, and which supports loading data in Parquet format. In this link you will find the documentation on how to set up an Amazon S3 data transfer.
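As a minimal sketch, a transfer configuration like this can also be created with the Python client library for the Data Transfer Service. The project, dataset, bucket, table, and credential values below are placeholders you would replace with your own:

```python
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# Recurring transfer that loads Parquet files from S3 into a BigQuery table.
# All names and credentials below are hypothetical placeholders.
transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",
    display_name="S3 Parquet transfer",
    data_source_id="amazon_s3",
    params={
        "data_path": "s3://my-bucket/data/*.parquet",
        "destination_table_name_template": "my_table",
        "file_format": "PARQUET",
        "access_key_id": "YOUR_AWS_ACCESS_KEY_ID",
        "secret_access_key": "YOUR_AWS_SECRET_ACCESS_KEY",
    },
    schedule="every 24 hours",
)

transfer_config = client.create_transfer_config(
    parent="projects/my-project",
    transfer_config=transfer_config,
)
print(f"Created transfer config: {transfer_config.name}")
```

Note this still copies the data into BigQuery on a schedule; it does not query the files in place on S3.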

Upvotes: 1

Nathan Griffiths

Reputation: 12756

No, this is not possible. BigQuery supports "external tables" where the data exists as files in Google Cloud Storage, but no other cloud file store, including AWS S3, is supported.

You will need to either copy/move the files from S3 to Cloud Storage and then use BigQuery on them, or use a similar AWS service such as Athena to query the files in situ on S3.
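If you take the first route, a rough sketch of the GCS-backed external table looks like this with the BigQuery Python client (bucket, project, dataset, and table names below are hypothetical, and the Parquet files are assumed to already be copied into the bucket):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Define an external table backed by Parquet files in Cloud Storage,
# after copying them over from S3. Names are placeholders.
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = ["gs://my-bucket/data/*.parquet"]

table = bigquery.Table("my-project.my_dataset.parquet_external")
table.external_data_configuration = external_config
client.create_table(table)

# Query the external table; BigQuery reads the Parquet files directly from GCS.
query = "SELECT COUNT(*) AS row_count FROM `my-project.my_dataset.parquet_external`"
for row in client.query(query).result():
    print(row.row_count)
```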

Upvotes: 1
