BigD
BigD

Reputation: 888

Insert into Nested records in Bigquery FROM another nested table

I am trying to insert data from one Bigquery table (nested) to another bigquery table (nested). However, I am getting issues during insert.

Source schema: T1

FieldName                     Type            Mode
 User                         STRING          NULLABLE
 order                        RECORD          REPEATED
   order.Name                 STRING          NULLABLE
   order.location             STRING          NULLABLE
   order.subscription         RECORD          NULLABLE
   order.subscription.date    TIMESTAMP       NULLABLE
   order.Details              RECORD          REPEATED
     order.Details.id         STRING          NULLABLE
     order.Details.nextDate   STRING          NULLABLE

Target schema: T2

FieldName                    Type            Mode
 User                        STRING          NULLABLE
 order                       RECORD          REPEATED
   order.Name                STRING          NULLABLE
   order.location            STRING          NULLABLE
   order.subscription        RECORD          NULLABLE
   order.subscription.date   TIMESTAMP       NULLABLE
   order.Details             RECORD          REPEATED
     order.Details.id        STRING          NULLABLE
     order.Details.nextDate  STRING          NULLABLE

I am trying to use insert into functionality of bigquery. I am looking to insert only few field from source table. My query is like below:

INSERT INTO T2 (user,order.name,order.subscription.date,details.id) 
 SELECT user,order.name,order.subscription.date,details.id 
 from 
   T1 o 
   join unnest  (o.order) order, 
   unnest ( order.details)  details

After a bit of googling I am aware that I would need to use STRUCT when defining field names while inserting, but not sure how to do it. Any help is appreciated. Thanks in advance!

Upvotes: 1

Views: 2491

Answers (1)

Betjens
Betjens

Reputation: 1401

You will have to insert the records as per is needed in your destination table, Struct types need to be inserted fully ( with all the records that it contains ).

I provide a small sample below, I build the following table with a single record to explain this:

create or replace table `project-id.dataset-id.table-source` (
    user STRING,
    order_detail STRUCT<name STRING, location STRING,subscription STRUCT<datesub TIMESTAMP>,details STRUCT<id STRING,nextDate STRING>>
)

insert into  `project-id.dataset-id.table-source` (user,order_detail)
values ('Karen',STRUCT('ShopAPurchase','Germany',STRUCT('2022-03-01'),STRUCT('1','2022-03-05')))

With that information we can now star inserting into our destination tables. In our sample, I'm reusing the source table and just adding an additional record into it like this:

insert into  `project-id.dataset-id.table-source` (user,order_detail)
select 'Anna',struct(ox.name,'Japan',ox.subscription,struct('2',dx.nextDate)) 
from  `project-id.dataset-id.table-source` o
join unnest ([o.order_detail]) ox, unnest ([o.order_detail.details])  dx

You will see that in order to perform an unnesting structs I will have to add the value inside an array []. As unnest flatens the struct as a single row. Also, when inserting struct types you will also have to create the struct or use the flattening records to create that struct column.

If you want to add additional records inside a STRUCT you will have to declare your destination table with an ARRAY inside of it. Lets look at this new table source_array:

create or replace table `project-id.dataset-id.table-source_array` (
    user STRING,
    order_detail STRUCT<name STRING, location STRING,subscription STRUCT<datesub TIMESTAMP>,details ARRAY<STRUCT<id STRING ,nextDate STRING>>>
)

insert into  `project-id.dataset-id.table-source_array` (user,order_detail)
values ('Karen',STRUCT('ShopAPurchase','Germany',STRUCT(['2022-03-01']),STRUCT('1','2022-03-05')))
insert into  `project-id.dataset-id.table-source_array` (user,order_detail)
select 'Anna',struct(ox.name,'Japan',ox.subscription,[struct('2',dx.nextDate),struct('3',dx.nextDate)]) 
from  `project-id.dataset-id.table-source` o
join unnest ([o.order_detail]) ox, unnest ([o.order_detail.details])  dx

Keep in mind that you should be careful as when dealing with this as you might encounter subarrays error which may cause issues.

I make use of the following documentation for this sample:

Upvotes: 3

Related Questions