Reputation: 353
I am exporting a table from BigQuery with Dataflow, and it seems that when it is processed by a ParDo, I can only get the string value of each field in the TableRow,
regardless of the field's original type in the BigQuery schema.
For example, say my table has an INTEGER-typed
column "fieldA":
public void processElement(ProcessContext c) throws Exception {
    TableRow row = c.element();
    String str = (String) row.get("fieldA");  // OK
    Integer i = (Integer) row.get("fieldA");  // Throws "String cannot be cast to Integer"
}
Is this a bug, or is it just me? If it is not just me, is there any way to get around it? For an INTEGER field I can still call Integer.valueOf(String),
but parsing a TIMESTAMP
field that way gets hacky and error-prone.
FYI, I am using BlockingDataflowPipelineRunner.
Upvotes: 1
Views: 1455
Reputation: 17913
According to BigQueryTableRowIterator:
Note that integers are encoded as strings to match BigQuery's exported JSON format.
So you'll need to use Integer.parseInt.
Sorry for the trouble; we should improve the documentation about the typing of values in the TableRow
when reading from BigQueryIO.Read,
since that documentation is not very discoverable.
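As a minimal sketch of the workaround: since every field arrives as a String, you can parse it back into the type you expect. The helper names below (parseIntegerField, parseTimestampField) are hypothetical, and the timestamp helper assumes the value arrives as fractional seconds since the epoch (e.g. "1.408452095E9"), matching BigQuery's exported JSON; check the actual strings in your rows before relying on that format.

```java
import java.math.BigDecimal;
import java.time.Instant;

public class TableRowParsing {

    // INTEGER fields arrive as decimal strings, e.g. "42".
    static long parseIntegerField(Object value) {
        return Long.parseLong((String) value);
    }

    // Assumption: TIMESTAMP fields arrive as (possibly fractional) seconds
    // since the epoch, e.g. "1.408452095E9". BigDecimal handles the
    // scientific notation without floating-point rounding.
    static Instant parseTimestampField(Object value) {
        BigDecimal seconds = new BigDecimal((String) value);
        long millis = seconds.movePointRight(3).longValue();
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        System.out.println(parseIntegerField("42"));                 // 42
        System.out.println(parseTimestampField("1.408452095E9"));   // 2014-08-19T12:41:35Z
    }
}
```

Inside processElement you would call these on row.get("fieldA") instead of casting the Object directly.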
Upvotes: 0