foxwendy

Reputation: 2929

Why do a double quote and <N> cause errors when uploading to BigQuery?

My program reported errors when it tried to upload a .csv file to BigQuery via a load job:

Job failed while writing to Bigquery. invalid: Too many errors encountered. Limit is: 0. at 
Error: [REASON] invalid [MESSAGE] Data between close double quote (") and field separator: field starts with: <N> [LOCATION] File: 0 / Line:21470 / Field:2
Error: [REASON] invalid [MESSAGE] Too many errors encountered. Limit is: 0. [LOCATION] 

I traced the error back to my file and found the specified line:

3D0F92F8-C892-4E6B-9930-6FA254809E58~"N" STYLE TOWING~1~0~5.7.1512.441~10.20.10.25:62342~MSSqlServer: N_STYLE on localhost~3~2015-12-17 01:56:41.720~1~<?xml version="1

The delimiter was set to ~, so why is the double quote (or maybe <N>) a problem?

Upvotes: 1

Views: 1708

Answers (1)

Jordan Tigani

Reputation: 26617

The CSV specification says that if a field contains a quote, the entire field should be quoted, as in a,b,"c,d", which has only three fields because the third comma is inside the quotes. The CSV parser gets confused when there is data after a closing quote but before the next delimiter, as in a,b,"c,d"e. That is what happens in your line: field 2 opens with a quote, the quote closes after N, and then STYLE TOWING follows before the next ~.
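To see the rule in isolation, here is a minimal sketch using Python's csv module in strict mode. This is only an illustration of the quoting rule, not BigQuery's actual parser, and parse_strict is a hypothetical helper name.

```python
import csv


def parse_strict(line, delimiter=",", quotechar='"'):
    """Parse one CSV line, rejecting data that follows a closing quote."""
    reader = csv.reader(
        [line], delimiter=delimiter, quotechar=quotechar, strict=True
    )
    return next(reader)


# A fully quoted field is fine: the comma inside the quotes is data.
print(parse_strict('a,b,"c,d"'))  # ['a', 'b', 'c,d']

# Data after the closing quote but before the delimiter is rejected.
try:
    parse_strict('a,b,"c,d"e')
except csv.Error as exc:
    print("rejected:", exc)
```

Without strict=True, Python's reader silently accepts the second line, so the strict flag is what makes it behave like a parser that enforces the rule.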

It sounds like you don't need a quote character at all, so you can fix this by specifying a custom quote character that will never appear in your data, such as \0 or |. You're already setting configuration.load.fieldDelimiter; just set configuration.load.quote as well.
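As a rough sketch of the idea, the snippet below first checks locally that the row splits cleanly once the double quote is no longer special, then shows what the load configuration fragment might look like. The row is a shortened copy of the failing line (first four fields only), load_config is a hypothetical variable that is not submitted anywhere, and the REST field names fieldDelimiter and quote are my understanding of the load job configuration, so check them against the API reference. Using | assumes that character never occurs in your data.

```python
import csv

# Shortened version of the failing row (first four fields only).
row = '3D0F92F8-C892-4E6B-9930-6FA254809E58~"N" STYLE TOWING~1~0'

# With quoting disabled, double quotes are ordinary characters, so the
# row splits on ~ alone. This is a local check, not BigQuery's parser.
fields = next(csv.reader([row], delimiter="~", quoting=csv.QUOTE_NONE))
print(fields[1])  # "N" STYLE TOWING

# Hypothetical load-job configuration fragment (an assumption, shown only
# to illustrate where the two settings live).
load_config = {
    "configuration": {
        "load": {
            "fieldDelimiter": "~",
            # Assumes | never appears in the data being loaded.
            "quote": "|",
        }
    }
}
```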

Upvotes: 1
