Reputation: 1932
I'm new to the Google Cloud Platform. I'm trying to populate a BigQuery table with data pulled from a CSV file stored in a bucket. I wrote a Python test script to create and populate the table. The table is created successfully, but the load step fails.
My script:
from google.cloud import bigquery
from google.cloud.bigquery import LoadJobConfig
from google.cloud.bigquery import SchemaField
client = bigquery.Client()
dataset_ref = client.dataset('datasetname')
## Create the table
schema = [
    bigquery.SchemaField('start_date', 'DATETIME', mode='NULLABLE'),
    bigquery.SchemaField('start_station_code', 'INTEGER', mode='NULLABLE'),
    bigquery.SchemaField('end_date', 'DATETIME', mode='NULLABLE'),
    bigquery.SchemaField('end_station_code', 'INTEGER', mode='NULLABLE'),
    bigquery.SchemaField('duration_sec', 'INTEGER', mode='NULLABLE'),
    bigquery.SchemaField('is_member', 'INTEGER', mode='NULLABLE'),
]
table_ref = dataset_ref.table('tablename')
table = bigquery.Table(table_ref, schema=schema)
table = client.create_table(table) # API request
## Loading data
SCHEMA = [
    SchemaField('start_date', 'DATETIME', mode='NULLABLE'),
    SchemaField('start_station_code', 'INTEGER', mode='NULLABLE'),
    SchemaField('end_date', 'DATETIME', mode='NULLABLE'),
    SchemaField('end_station_code', 'INTEGER', mode='NULLABLE'),
    SchemaField('duration_sec', 'INTEGER', mode='NULLABLE'),
    SchemaField('is_member', 'INTEGER', mode='NULLABLE'),
]
#table_ref = client.dataset('dataset_name').table('table_name')
load_config = LoadJobConfig()
load_config.skip_leading_rows = 1
load_config.schema = SCHEMA
uri = 'gs://gcp-development/object.csv'
load_job = client.load_table_from_uri(
    uri,
    table_ref,
    job_config=load_config)
load_job.result()
destination_table = client.get_table(table_ref)
print('Loaded {} rows.'.format(destination_table.num_rows))
Per the documentation, that seems correct. However, I'm getting the following error, which I don't understand, and I don't know how to check the logs for more details.
error:
google.api_core.exceptions.BadRequest: 400 Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details.
Where is the error stream? Following the troubleshooting documentation, I tried:
GET https://www.googleapis.com/bigquery/v2/projects/projectId/queries/jobId
but it returned nothing.
Thanks for your help.
Upvotes: 1
Views: 1124
Reputation: 2893
I was able to load data using your script with no issues at all. You can check the full error message in the Logs Viewer by selecting BigQuery in the resource menu on the left. The failure is likely a parsing error on the DATETIME fields. You can find more information about how to use the Logs Viewer in its documentation.
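As an aside, the "error stream" the message refers to is also available directly in your script: the job object returned by `load_table_from_uri` exposes a list of error dicts via `load_job.errors` once the job has finished. A minimal sketch (the `format_load_errors` helper is hypothetical, written here just to show the shape of the entries):

```python
def format_load_errors(errors):
    """Render BigQuery job error entries (a list of dicts with
    'reason', 'location', and 'message' keys) as readable lines."""
    return [
        "{reason} at {location}: {message}".format(
            reason=e.get("reason", "unknown"),
            location=e.get("location", "n/a"),
            message=e.get("message", ""),
        )
        for e in (errors or [])
    ]

# Hypothetical usage with the script from the question:
# try:
#     load_job.result()
# except Exception:
#     for line in format_load_errors(load_job.errors):
#         print(line)
```

The per-row parse failures (e.g. a DATETIME value BigQuery could not parse) show up in those entries, which is usually faster than digging through the Logs Viewer.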
The data sample that I used was the following:
start_date,start_station_code,end_date,end_station_code,duration_sec,is_member
0001-01-01 00:00:00.000000,1,9999-12-31 23:59:59.999999,2,3,4
0001-01-01 00:00:00.000000,2,9999-12-31 23:59:59.999999,3,4,5
0001-01-01 00:00:00.000000,3,9999-12-31 23:59:59.999999,4,5,6
0001-01-01 00:00:00.000000,4,9999-12-31 23:59:59.999999,5,6,7
0001-01-01 00:00:00.000000,5,9999-12-31 23:59:59.999999,6,7,8
0001-01-01 00:00:00.000000,6,9999-12-31 23:59:59.999999,7,8,9
0001-01-01 00:00:00.000000,7,9999-12-31 23:59:59.999999,8,9,10
0001-01-01 00:00:00.000000,8,9999-12-31 23:59:59.999999,9,10,11
0001-01-01 00:00:00.000000,9,9999-12-31 23:59:59.999999,10,11,12
0001-01-01 00:00:00.000000,10,9999-12-31 23:59:59.999999,11,12,13
Upvotes: 1