Xinwei Liu

Reputation: 353

TableRow.get("field_name") can only be cast to String in Dataflow ParDo

I am exporting a table from BigQuery with Dataflow. Inside a ParDo, every field I read from a TableRow comes back as a String, regardless of the column's data type in the BigQuery schema.

For example, say my table has an INTEGER-typed column "fieldA":

     public void processElement(ProcessContext c) throws Exception {
         TableRow row = c.element();
         // Fields are read from the TableRow, not from the ProcessContext.
         String str = (String) row.get("fieldA"); // OK
         Integer i = (Integer) row.get("fieldA"); // Throws "String cannot be cast to Integer"
     }

Is this a bug, or is it only me? If it is not only me, is there any way to get around it? For integer columns I can still use Integer.valueOf(String), but parsing a Timestamp field that way would be hacky and error-prone.

FYI, I am using BlockingDataflowPipelineRunner.

Upvotes: 1

Views: 1455

Answers (1)

jkff

Reputation: 17913

According to BigQueryTableRowIterator:

Note that integers are encoded as strings to match BigQuery's exported JSON format.

So you'll need to call Integer.parseInt on the string value. Sorry for the trouble; we should improve the documentation on how values are typed in the TableRow when reading with BigQueryIO.Read, because the current note is not very discoverable.
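
A minimal sketch of that parsing step, in case it helps. The class name TableRowFields and the method getLong are hypothetical helpers, not part of the Dataflow SDK. The sketch takes a plain Map<String, Object> so it runs without the SDK on the classpath; TableRow is itself a Map<String, Object>, so a real row should be usable in the same way. It uses Long.parseLong rather than Integer.parseInt because BigQuery INTEGER columns are 64-bit, so large values would overflow an int.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Dataflow SDK.
final class TableRowFields {
    private TableRowFields() {}

    // Reads a field that arrives as a string and parses it as a 64-bit integer.
    // Returns null when the field is absent or holds null (a NULL column value).
    static Long getLong(Map<String, Object> row, String field) {
        Object value = row.get(field);
        if (value == null) {
            return null;
        }
        // toString() covers a String value as well as an Integer or Long value.
        return Long.parseLong(value.toString().trim());
    }

    public static void main(String[] args) {
        // Stand-in for a TableRow whose INTEGER column was encoded as a string.
        Map<String, Object> row = new HashMap<>();
        row.put("fieldA", "42");

        Long fieldA = getLong(row, "fieldA");
        System.out.println(fieldA); // prints 42
    }
}
```

Inside processElement this would be called as TableRowFields.getLong(row, "fieldA") in place of the cast. Long.parseLong throws NumberFormatException on malformed input, so wrap the call if the column may contain unexpected values.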

Upvotes: 0
