jaycode
jaycode

Reputation: 2958

Google BigQuery: "Invalid source format CSV" when loading data from Datalab

I have some files stored in Google Cloud Storage, and I run the following from Google Cloud Datalab's Notebook:

path = r'data/path/to/csv/orders_1234'
orders_path = os.path.join(bucket_path, path)
print(orders_path)
dest = 'project_name:dataset_name.orders'
%bigquery load -m append -f csv -d '\t' -S $orders_path -D $dest

But I got error Invalid source format CSV when running this code. Do you know what I may have done incorrectly here?

Thank you!

Upvotes: 1

Views: 619

Answers (1)

Anthonios Partheniou
Anthonios Partheniou

Reputation: 1709

Please try the following working example taken from the datalab sample notebooks repository.

import datalab.bigquery as bq
import datalab.storage as storage
import pandas as pd
from StringIO import StringIO

# Create the schema, conveniently using a DataFrame example.
%storage read --object gs://cloud-datalab-samples/cars.csv --variable cars
df = pd.read_csv(StringIO(cars))
schema = bq.Schema.from_dataframe(df)

# Create the dataset
bq.Dataset('sample').create()

# Create the table
sample_table = bq.Table('sample.cars').create(schema = schema, overwrite = True)

# Load csv file from GCS to Google BigQuery
sample_table.load('gs://cloud-datalab-samples/cars.csv', mode='append',
                  source_format = 'csv', csv_options=bq.CSVOptions(skip_leading_rows = 1))   

Please note: If you are running an older version of datalab, you may have to use import gcp.bigquery as bq instead of import datalab.bigquery as bq

You may be able to see additional error message information on the Job History page in the BigQuery console.

Upvotes: 1

Related Questions