Reputation: 1574
There is a Big Query table with nested columns, the DDL for the table is:
CREATE TABLE stg.raw_data.user_events
(
pipeline_metadata STRUCT<uuid STRING, timestamp TIMESTAMP, restream_count INT64, pubsub_subscription_name STRING>,
event_name STRING,
insertId STRING,
eventID STRING,
timestamp TIMESTAMP,
metadata STRUCT<userID INT64, uuid STRING, sessionID STRING, platform STRING, prime_status STRING, created_at TIMESTAMP, city STRING, country STRING, embed_url STRING, embed_domain STRING, deviceID STRING, device_type STRING, state STRING, metadata_schema_version STRING, schema_version STRING, ipAddress STRING>,
properties STRING
)
PARTITION BY DATE(timestamp);
Sample data from this table looks like this:
[{
"pipeline_metadata": {
"uuid": "d2aae738-ddbe-43fc-b1a0-a0a3d7f4a66c",
"timestamp": "2022-10-06 00:44:53.804000 UTC",
"restream_count": "0",
"pubsub_subscription_name": "raw-user-events-data-ingestor-subscription"
},
"event_name": "search",
"insertId": "5751196922008325",
"eventID": "rCz_ZG70T-CYlJ7uAgWhvxOeRKqoevLb",
"timestamp": "2022-10-06 00:44:52.792000 UTC",
"metadata": {
"userID": "2235338",
"uuid": "3cf13e3f499339f45bc1a344bd5c83866ebae85b6c89255ff203467552157aaa",
"sessionID": "140b56af96c6331e8e5cc88721143d788a30b3c35fe28dac2c2e3ceeb1ee4948",
"platform": "web-app",
"prime_status": "subscription",
"created_at": "2022-10-06 00:44:52.767000 UTC",
"city": "Woodside",
"country": "United States",
"embed_url": null,
"embed_domain": null,
"deviceID": "db0f3955624d5541b587ac4661a25dfd",
"device_type": "desktop",
"state": "New York",
"metadata_schema_version": "latest",
"schema_version": "latest",
"ipAddress": null
},
"properties": "{\"action\":\"focus-on-input\",\"source\":\"dashboard\",\"page_path\":\"/home/dashboard\"}"
}]
I am trying to create a similar table in Snowflake, the nested column structure should be retained in Snowflake, because this is a migration from Big Query to Snowflake, hence the structure should be the same between both the warehouses.
I was able to prepare the table DDL for Snowflake but columns are not nested similar to the Big Query table.
CREATE or replace TABLE test.dbt."USER_EVENTS"
(
pipeline_metadata variant,
event_name STRING,
insertId STRING,
eventID STRING,
time_stamp TIMESTAMP,
metadata variant,
properties STRING
)
CLUSTER BY (DATE(time_stamp));
How can I build a DDL for Snowflake table, where the structure of the table is similar to that of the Big Query table.
Upvotes: 0
Views: 360
Reputation: 3901
Depending on how you define "similar" (between the DBMSes where you don't have a direct correspondence between a given feature), you could simply flatten out the STRUCTs
CREATE TABLE stg.raw_data.user_events
(
uuid STRING,
meta_timestamp TIMESTAMP,
restream_count INT,
pubsub_subscription_name STRING,
event_name STRING,
insertId STRING,
eventID STRING,
time_stamp TIMESTAMP,
userID INT,
useer_uuid STRING,
sessionID STRING,
platform STRING,
prime_status STRING,
created_at TIMESTAMP,
city STRING,
country STRING,
embed_url STRING,
embed_domain STRING,
deviceID STRING,
device_type STRING,
state STRING,
metadata_schema_version STRING,
schema_version STRING,
ipAddress STRING,
properties VARIANT
)
CLUSTER BY (time_stamp)
that is certainly somewhat similar.. if it is similar enough is not for us to decide.
Upvotes: 1