Reputation: 354
When I load data in CSV format such as the row below:
G000021318, 17.0, New, 0.0, None, jan, 2010
BigQuery strips the leading G0000 from the first field and converts it to an integer (21318).
The code that sets up the load into the table is as follows:
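// Assumption: loadConfig is a JobConfigurationLoad created earlier (not shown in this snippet).
// Source files in Google Cloud Storage; "part*" matches every part file.
List<String> sources = new ArrayList<String>();
sources.add("gs://" + googleBucket + "/" + accountId + "/" + sourceFile + "_" + account.getSuffix() + "/part*");
loadConfig.setSourceUris(sources);
// Destination table for the load job.
TableReference tableRef = new TableReference();
tableRef.setDatasetId(datasetId);
tableRef.setTableId(flagVolumeMonthTable + "_" + account.getSuffix());
tableRef.setProjectId(googleProjectId);
loadConfig.setDestinationTable(tableRef);
// Comma-delimited input with schema autodetection enabled.
loadConfig.setFieldDelimiter(",");
loadConfig.setAutodetect(true);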
Am I missing something, or is this a bug in BigQuery's schema detection?
Upvotes: 2
Views: 1685
Reputation: 14014
The problem happened because BigQuery's autodetect code decided that G000021318 is an ISO-compliant format for the Haitian gourde currency because of the G prefix, and eagerly interpreted the data as an INT64 representing 21318 gourdes :)
We have fixed the autodetect code so that it reacts only to unambiguous currency symbols such as $, €, £, ¥, ¢, etc.
P.S. The fix will propagate into production systems within weeks.
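To make the behavior described above concrete, here is a minimal Java sketch of the two detection rules. It is not BigQuery's actual autodetect code: the class name CurrencyAutodetectSketch, the EAGER and STRICT patterns, and the detectType and parseAmount helpers are all hypothetical and only mimic the idea of "prefix plus digits is treated as a currency amount".

```java
import java.util.regex.Pattern;

public class CurrencyAutodetectSketch {
    // Hypothetical eager rule: "G" is accepted as a currency prefix (gourde),
    // alongside the symbols $, euro, pound, yen, cent.
    static final Pattern EAGER = Pattern.compile("[G$\u20AC\u00A3\u00A5\u00A2]\\d+");
    // Hypothetical stricter rule: only unambiguous currency symbols are accepted.
    static final Pattern STRICT = Pattern.compile("[$\u20AC\u00A3\u00A5\u00A2]\\d+");
    // A plain integer with no prefix.
    static final Pattern PLAIN_INT = Pattern.compile("-?\\d+");

    // Returns "INT64" if the value is a plain integer or matches the given
    // currency rule, otherwise "STRING".
    static String detectType(String value, Pattern currencyRule) {
        String v = value.trim();
        if (PLAIN_INT.matcher(v).matches() || currencyRule.matcher(v).matches()) {
            return "INT64";
        }
        return "STRING";
    }

    // Drops the one-character prefix and parses the remaining digits.
    static long parseAmount(String value) {
        return Long.parseLong(value.trim().substring(1));
    }

    public static void main(String[] args) {
        String id = "G000021318";
        System.out.println(detectType(id, EAGER));   // INT64
        System.out.println(parseAmount(id));         // 21318
        System.out.println(detectType(id, STRICT));  // STRING
    }
}
```

Under the eager rule the ID from the question is classified as an integer and loses its prefix and leading zeros; under the stricter rule it stays a string, while a value such as $17 would still be detected as a number.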
Upvotes: 4