Reputation: 1726
I have a record set of <100k rows, and BigQuery won't let me download it to my computer in CSV format, claiming it is too big and that I have to store it first. Is there any workaround? I want to load my output into R, and the easiest way is via CSV.
Upvotes: 3
Views: 3255
Reputation: 5920
I'm using the following Python script for this task; it can handle large result sets without loading them into memory.
Make sure to install the dependencies and change the variables:
pip install google-cloud-bigquery
Change the variables (the query, project, output file, and the file encoding, if required) to fit your needs:
from google.cloud import bigquery
import codecs
import csv

# Output file
output_file = "output.csv"
# GCP project
project = "<some-project>"
# File encoding - the utf-8-sig codec removes the BOM if present and is Excel-friendly
file_encoding = "utf-8-sig"
# The query to execute
query = """
SELECT * FROM my-table
"""

client = bigquery.Client(project=project)
query_job = client.query(query)
result = query_job.result()
schema = result.schema

with codecs.open(output_file, "w", encoding=file_encoding) as f:
    writer = csv.writer(f)
    # Write headers
    header = [field.name for field in schema]
    writer.writerow(header)
    # Write data to file, streaming the rows page by page
    for row in result:
        writer.writerow(row)
Upvotes: 0
Reputation: 207912
You need to use Google Cloud Storage for your export job. Exporting data from BigQuery is explained here; also check the variants for different path syntaxes.
Then you can download the files from GCS to your local storage. There is no way to download large data directly from BigQuery to your local computer; you have to go via GCS.
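As a minimal sketch of that two-step flow with the Python clients (assuming the query results have already been saved to a table, and that `google-cloud-storage` is installed alongside `google-cloud-bigquery`; `<some-project>`, `<some-bucket>`, and the table ID are placeholders you would replace):
from google.cloud import bigquery, storage

project = "<some-project>"
bucket_name = "<some-bucket>"
table_id = "<some-project>.<dataset>.<table>"          # table holding the results to export
destination_uri = f"gs://{bucket_name}/export-*.csv"   # wildcard allows sharded output

# Step 1: export the table to Cloud Storage as CSV
bq_client = bigquery.Client(project=project)
extract_job = bq_client.extract_table(table_id, destination_uri)
extract_job.result()  # wait for the export job to finish

# Step 2: download the exported file(s) from GCS to the local machine
storage_client = storage.Client(project=project)
for blob in storage_client.list_blobs(bucket_name, prefix="export-"):
    blob.download_to_filename(blob.name)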
Upvotes: 4