Reputation: 11
I am trying to export a table as CSV file in GCS bucket from Google Big Query console with export command, but facing issues with double quotes.
I want to export as csv but every column value should be surrounded with double quotes, i tried using various String functions such as FORMAT, CONCAT, REPLACE etc, but not able to get desired results.
I am trying to export as csv because only these formats are supported by Export command (CSV, JSON, Parquet, Avro) , even a plain text file is fine for me, but don't have that option and i need columns surrounded by double quotes and delimited by comma.
this is export command:
EXPORT DATA OPTIONS( uri='gs://bucket-name123/SOME-FILE-*', format='CSV', overwrite=true, header=false, field_delimiter=',')
AS
SELECT CONCAT('\"',column1,'\"'), CONCAT('\"',column2,'\"') from `projectId.datasetId.TABLE_NAME` where column3 = true;
Data is getting exported successfully in GCS bucket , but i am getting 3 double quotes surrounding this column instead of 1 double quotes:
content of GCS exported GCS file is as below:
"""value1""","""AAA""""
"""value2""","""BBB""""
"""value3""","""CCC""""
I am expecting this content format:
"value1","AAA"
"value2","BBB"
Any help or hint will be appreciated,
I have also tried FORMAT('%s',column1)
-- this function, but it also produces 3 double quotes
I also tried to hit EXPORT query as a job from Java Google BQ api, but same result.
I can't use "bq extract" because i want to only export limited columns, and extract don't have functionality to export limited columns, it exports whole table.
Upvotes: 1
Views: 1732
Reputation: 31
As mentioned earlier by others, the CSV export in BQ will automatically apply the Text Qualifiers only when the value being exported contains a delimiter.
Your best bet is to post-process the exported file. You can use SED
Upvotes: 0
Reputation: 1839
Don't add double quotes yourself. When you export to CSV from BigQuery, double quotes are added automatically, if they are needed, e.g. for multi-line strings and escaping double quotes (which is why you are seeing repeated double quotes in your output)
So you command should simply be:
EXPORT DATA OPTIONS(
uri='gs://bucket-name123/SOME-FILE-*',
format='CSV',
overwrite=true,
header=false,
field_delimiter=',')
AS SELECT column1, column2 from projectId.datasetId.TABLE_NAME
WHERE column3 = true;
Most CSV readers will be able to read the output format.
Upvotes: 0