Teja Goud Kandula
Teja Goud Kandula

Reputation: 1574

Convert Big Query DDL to Snowflake DDL

There is a Big Query table with nested columns, the DDL for the table is:

CREATE TABLE stg.raw_data.user_events
(
pipeline_metadata STRUCT<uuid STRING, timestamp TIMESTAMP, restream_count INT64, pubsub_subscription_name STRING>,
event_name STRING,
insertId STRING,
eventID STRING,
timestamp TIMESTAMP,
metadata STRUCT<userID INT64, uuid STRING, sessionID STRING, platform STRING, prime_status STRING, created_at TIMESTAMP, city STRING, country STRING, embed_url STRING, embed_domain STRING, deviceID STRING, device_type STRING, state STRING, metadata_schema_version STRING, schema_version STRING, ipAddress STRING>,
properties STRING
)
PARTITION BY DATE(timestamp);

Sample data from this table looks like this:

[{
  "pipeline_metadata": {
    "uuid": "d2aae738-ddbe-43fc-b1a0-a0a3d7f4a66c",
    "timestamp": "2022-10-06 00:44:53.804000 UTC",
    "restream_count": "0",
    "pubsub_subscription_name": "raw-user-events-data-ingestor-subscription"
  },
  "event_name": "search",
  "insertId": "5751196922008325",
  "eventID": "rCz_ZG70T-CYlJ7uAgWhvxOeRKqoevLb",
  "timestamp": "2022-10-06 00:44:52.792000 UTC",
  "metadata": {
    "userID": "2235338",
    "uuid": "3cf13e3f499339f45bc1a344bd5c83866ebae85b6c89255ff203467552157aaa",
    "sessionID": "140b56af96c6331e8e5cc88721143d788a30b3c35fe28dac2c2e3ceeb1ee4948",
    "platform": "web-app",
    "prime_status": "subscription",
    "created_at": "2022-10-06 00:44:52.767000 UTC",
    "city": "Woodside",
    "country": "United States",
    "embed_url": null,
    "embed_domain": null,
    "deviceID": "db0f3955624d5541b587ac4661a25dfd",
    "device_type": "desktop",
    "state": "New York",
    "metadata_schema_version": "latest",
    "schema_version": "latest",
    "ipAddress": null
  },
  "properties": "{\"action\":\"focus-on-input\",\"source\":\"dashboard\",\"page_path\":\"/home/dashboard\"}"
}]

I am trying to create a similar table in Snowflake, the nested column structure should be retained in Snowflake, because this is a migration from Big Query to Snowflake, hence the structure should be the same between both the warehouses.

I was able to prepare the table DDL for Snowflake but columns are not nested similar to the Big Query table.

CREATE or replace TABLE test.dbt."USER_EVENTS"
(
pipeline_metadata variant,
event_name STRING,
insertId STRING,
eventID STRING,
time_stamp TIMESTAMP,
metadata variant,
properties STRING
)
CLUSTER BY (DATE(time_stamp));

How can I build a DDL for Snowflake table, where the structure of the table is similar to that of the Big Query table.

Upvotes: 0

Views: 360

Answers (1)

Paul Vernon
Paul Vernon

Reputation: 3901

Depending on how you define "similar" (between the DBMSes where you don't have a direct correspondence between a given feature), you could simply flatten out the STRUCTs

CREATE TABLE stg.raw_data.user_events
(
  uuid STRING,
  meta_timestamp TIMESTAMP,
  restream_count INT,
  pubsub_subscription_name STRING,
  event_name STRING,
  insertId STRING,
  eventID STRING,
  time_stamp TIMESTAMP,
  userID INT, 
  useer_uuid STRING,
  sessionID STRING,
  platform STRING,
  prime_status STRING,
  created_at TIMESTAMP,
  city STRING,
  country STRING,
  embed_url STRING,
  embed_domain STRING,
  deviceID STRING,
  device_type STRING,
  state STRING,
  metadata_schema_version STRING,
  schema_version STRING,
  ipAddress STRING,
  properties VARIANT
)
 CLUSTER BY (time_stamp)

that is certainly somewhat similar.. if it is similar enough is not for us to decide.

Upvotes: 1

Related Questions