Reputation: 1365
I'm using node.js to:
avsc
NPM package@google-cloud/storage
NPM package@google-cloud/bigquery
NPM package to load the Avro from GCS to BQMy problem is that even if I set useAvroLogicalTypes
when I create the load job, my date data is never correctly created as TIMESTAMP
in BQ, always INTEGER
- which should be the normal behavior when useAvroLogicalTypes
is NOT set.
According to the same docs, if timestamp-millis
is set in the Avro schema definition, it should load as TIMESTAMP
.
My configuration:
{
"name": "metadata",
"type": {
"name": "metadata",
"type": "record",
"fields": [
{
"name": "creationTime",
"type": "long",
"logicalType": "timestamp-millis"
},
{
"name": "lastActivity",
"type": ["null", "long"],
"logicalType": "timestamp-millis"
},
{
"name": "deletionTime",
"type": ["null", "long"],
"logicalType": "timestamp-millis"
},
{
"name": "lastSignInTime",
"type": ["null", "long"],
"logicalType": "timestamp-millis"
}
]
}
}
const metadata = {
sourceFormat: 'AVRO',
useAvroLogicalTypes: true
}
The only way I could get it work is by specifying the scema again in the metadata var for the load job. My question is, isn't the whole point of specifying the Avro schema and turning on useAvroLogicalTypes
to avoid having to explicitly spec the schema again the the load job metadata?
const metadata = {
sourceFormat: 'AVRO',
useAvroLogicalTypes: true,
schema: {
fields: [
...otherFields,
{
name: 'metadata',
type: 'STRUCT',
mode: 'REQUIRED',
fields: [
{ name: 'creationTime', type: 'TIMESTAMP', mode: 'REQUIRED' },
{ name: 'lastActivity', type: 'TIMESTAMP', mode: 'NULLABLE' },
{ name: 'deletionTime', type: 'TIMESTAMP', mode: 'NULLABLE' },
{ name: 'lastSignInTime', type: 'TIMESTAMP', mode: 'NULLABLE' }
]
}
]
}
}
Upvotes: 1
Views: 765
Reputation: 4051
Answering your question about specifying the Avro schema during creating the load job. It is possible to omit schema property, which is the schema for the destination table:
The schema can be omitted if the destination table already exists, or if you're loading data from Google Cloud Datastore.
If you want to read more about it, please refer to the documentation.
In addition, you can also take a look to the --autodetect flag which replaces the schema definition.
I hope it will help you.
Upvotes: 2