A.M

Reputation: 1

BigQuery: NodeJS client library does not seem to respect useAvroLogicalTypes when performing a load job from Cloud Storage

I am trying to run a job from my Node.js server that loads an AVRO file from Cloud Storage into a BigQuery table. The job runs successfully, but the date column is loaded as type INTEGER in the table. I included the useAvroLogicalTypes parameter in the load job, but it doesn't seem to have any effect.

If I cast the date column in the table with SELECT DATE(TIMESTAMP_MILLIS(date)) I do get the correct date, but I am hoping to avoid this extra transformation step. I have read everywhere that Avro logical types can be converted implicitly if the parameter is set, yet I have been unable to get it to work. The table was created by the job itself, so there was no pre-existing schema.
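For reference, the interim workaround looks like this when run through the client library (my_dataset and my_table are placeholders for the real identifiers):

// The interim cast workaround, run through the client library;
// my_dataset and my_table stand in for the real dataset and table.
bigquery
  .query({
    query: 'SELECT DATE(TIMESTAMP_MILLIS(date)) AS date, medium, source, campaign ' +
      'FROM `my_dataset.my_table`',
  })
  .then(([rows]) => console.log(rows));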

The client library versions I am using are 4.4.0 for @google-cloud/bigquery and 4.1.2 for @google-cloud/storage.

AVRO SCHEMA:

const schema = {
    "name": "root",
    "type": "record",
    "fields": [
      { "name": "date", "type": ["null", { "type": "long", "logicalType": "date" }]},
      { "name": "medium", "type": ["null", "string"] },
      { "name": "source", "type": ["null", "string"] },
      { "name": "campaign", "type": ["null", "string"] },
    ]
  };
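For reference, the Avro specification pairs the date logical type with an int counting days since the Unix epoch, while millisecond longs map to timestamp-millis; the long + date combination above is not a pairing the spec defines. A sketch of the two spec-compliant variants of the field:

// Spec-compliant logical-type pairings for the date field (a sketch,
// not taken from the original post):
const dateAsDays = {
  "name": "date",
  "type": ["null", { "type": "int", "logicalType": "date" }] // days since epoch
};
const dateAsMillis = {
  "name": "date",
  "type": ["null", { "type": "long", "logicalType": "timestamp-millis" }] // ms since epoch
};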

JOB CODE

const options = {
    sourceFormat: 'AVRO',
    writeDisposition: 'WRITE_TRUNCATE',
    useAvroLogicalTypes: true,
    datasetID,
  };

bigquery
    .dataset(datasetID)
    .table(tableID)
    .load(storage.bucket(bucketName).file(fileName), options)
    .then(results => {

      const res = results[0];

      // load() waits for the job to finish
      console.log(`Job ${res.id} completed.`);

      // Check the job's status for errors
      const errors = res.status.errors;

      if (errors && errors.length > 0) {
        E = errors; // E: outer variable used to hand the errors back to the Fiber
      }

      // This kicks execution back to where the Fiber.yield() call stopped it
      fiber.resume();
    })
    .catch(err => {
      console.error('ERROR:', err);
    });
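To check whether the flag actually reached the service, the job's configuration can be inspected after the fact. A minimal debugging sketch, assuming jobId holds the id logged above:

// Minimal sketch (assumption: jobId holds the job id logged above).
// Fetch the job and inspect whether useAvroLogicalTypes made it into
// the load configuration the service actually executed.
bigquery
  .job(jobId)
  .getMetadata()
  .then(([metadata]) => {
    console.log('useAvroLogicalTypes:', metadata.configuration.load.useAvroLogicalTypes);
  });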

Sample raw data:

data = [
{"date":"2019-08-01","medium":"(none)","source":"(direct)","campaign":"(not set)","users":3053},
{"date":"2019-08-01","medium":"(not set)","source":"email-client","campaign":"(not set)","users":3},
{"date":"2019-08-01","medium":"affiliate","source":"sdn","campaign":"(not set)","users":1},
{"date":"2019-08-01","medium":"email","source":"corner","campaign":"onboarding","users":1},
{"date":"2019-08-01","medium":"email","source":"custom-playlist","campaign":"fonboarding","users":1},
{"date":"2019-08-01","medium":"email","source":"deref-mail.com","campaign":"(not set)","users":2},
{"date":"2019-08-01","medium":"email","source":"faketempmail","campaign":"(not set)","users":1},
{"date":"2019-08-01","medium":"email","source":"fundx","campaign":"email_campaign","users":1},
{"date":"2019-08-01","medium":"email","source":"email-client","campaign":"(not set)","users":14},
{"date":"2019-08-01","medium":"email","source":"email-client","campaign":"100k","users":2},
]

I convert the date property to a long using Moment.js, Underscore.js, and a simple map function:

data = _.map(data, row => {
  const m = moment(row.date);
  row.date = m.isValid() ? m.valueOf() : null; // epoch milliseconds
  return row;
});
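If the schema keeps the date logical type, the value the Avro spec expects is days since the epoch rather than milliseconds. A sketch of that conversion, assuming the int + date pairing shown earlier:

// Sketch: convert "YYYY-MM-DD" to days since the Unix epoch (86400000 ms
// per day), matching the Avro spec's int + date pairing. moment.utc avoids
// timezone off-by-one errors.
data = _.map(data, row => {
  const m = moment.utc(row.date, 'YYYY-MM-DD');
  row.date = m.isValid() ? Math.floor(m.valueOf() / 86400000) : null;
  return row;
});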

Upvotes: 0

Views: 1000

Answers (1)

F10

Reputation: 2883

What you describe is the default behavior: the Avro logical type date is stored as INTEGER in BigQuery when useAvroLogicalTypes is not set.

This also depends on how your AVRO schema was built. For instance, I had to build an AVRO file with the schema

"fields":[
    {"logicaltype": "date", "type": "string", "name": "field1"}
]

and I uploaded my DATE data correctly by using the following code:

const metadata = {
  sourceFormat: 'AVRO',
  useAvroLogicalTypes: true,
  createDisposition: 'CREATE_IF_NEEDED',
  writeDisposition: 'WRITE_TRUNCATE',
  schema: {
    fields: [
      {
        name: "field1",
        type: "DATE",
        logicalType: "STRING",
        mode: "NULLABLE"
      }
    ],
  },
  location: 'US',
};

const [job] = await bigquery
  .dataset(datasetId)
  .table(tableId)
  .load(storage.bucket(bucketName).file(filename), metadata);

Based on the data that you shared, it should work with this configuration, since your date values are strings.
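If the date values are left as the original "YYYY-MM-DD" strings to match that schema, the Moment.js conversion step can be dropped. A minimal sketch that only validates the strings (the strict-format check is an assumption, not from the original post):

// Sketch: keep date as the original "YYYY-MM-DD" string and null out
// values that fail strict-format validation.
data = _.map(data, row => {
  row.date = moment(row.date, 'YYYY-MM-DD', true).isValid() ? row.date : null;
  return row;
});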

Hope it helps.

Upvotes: 0
