Tobias Hermann
Tobias Hermann

Reputation: 10946

How to merge different schemas of structs in arrays (filling missing columns with null)?

Given these one tables (a being an array of structs).

baz_v1 (a being ARRAY<STRUCT<x INT64>>):

+===========+
| a.x | b   |
+===========+
| 1   | one |
| 2   |     |
+===========+

baz_v2 (a being ARRAY<STRUCT<x INT64, y INT64>>):

+=================+
| a.x | a.z | b   |
+=================+
| 3   | 4   | one |
| 5   | 6   |     |
+-----------------+
| 7   | 8   | two |
| 9   | 0   |     |
+-----------------+
| 11  | 12  | two |
| 13  | 14  |     |
+=================+

How can I obtain the following (concatenated) table/view?

+==================+
| a.x | a.y  | b   |
+==================+
| 1   | null | one |
| 2   | null |     |
+------------------+
| 3   | 4    | one |
| 5   | 6    |     |
+------------------+
| 7   | 8    | two |
| 9   | 10   |     |
+------------------+
| 11  | 12   | two |
| 13  | 14   |     |
+==================+

Code:

WITH `baz_v1` AS (
    SELECT
    [
        STRUCT(1 AS x),
        STRUCT(2 AS x)
    ]
    a,
    "one" b
), `baz_v2` AS (
    SELECT
    [
        STRUCT(3 AS x, 4 AS y),
        STRUCT(5 AS x, 6 AS y)
    ]
    a,
    "one" b
    UNION ALL
    SELECT
    [
        STRUCT(7 AS x, 8 AS y),
        STRUCT(9 AS x, 10 AS y)
    ]
    a,
    "two" b
    UNION ALL
    SELECT
    [
        STRUCT(11 AS x, 12 AS y),
        STRUCT(13 AS x, 14 AS y)
    ]
    a,
    "two" b
)

-- todo: Insert magic here, because the below, of course, does not work.
SELECT * FROM baz_v2
UNION ALL
SELECT * FROM baz_v1

Upvotes: 1

Views: 573

Answers (3)

Tobias Hermann
Tobias Hermann

Reputation: 10946

Building on the very good answer given by Mikhail Berlyant, I've found another solution, one that does not use SELECT * REPLACE:

SELECT ARRAY(SELECT AS STRUCT x, NULL AS y FROM baz_v1.a) AS a, b FROM baz_v1
UNION ALL
SELECT * FROM baz_v2

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173028

Consider below

select * replace(
  array(select as struct x, null as y from t.a) as a)
from `baz_v1` t
union all 
select * from `baz_v2`                 

if applied to sample data in y our question - output is

enter image description here

Upvotes: 2

JongWon Lee
JongWon Lee

Reputation: 135

The following method was used to merge these tables.

(1) To use an union all the target tables, array type was flattened.

(2) In order to match the number of columns in the UNION ALL target tables, the number of columns is appended as much as the insufficient number of columns by LEFT JOIN (SELECT '' as y) ON FALSE - reference

(3) ARRAY_AGG, STRUCT, GROUP BY were used to output the array result. - reference

WITH `baz_v1` AS (
    SELECT
    [
        STRUCT(1 AS x),
        STRUCT(2 AS x)
    ] a,
    "one" b
), `baz_v2` AS (
    SELECT
    [
        STRUCT(3 AS x, 4 AS y),
        STRUCT(5 AS x, 6 AS y)
    ] a,
    "two" b
)

SELECT ARRAY_AGG (
    STRUCT(x, y) 
    ) as a,
    b
FROM (
    SELECT xy.x as x, xy.y as y, b
    FROM baz_v2, UNNEST(a) as xy
    UNION ALL
    SELECT x, y, b
    FROM (
        SELECT x.x as x, CAST(y as INT64) as y, b
        FROM baz_v1, UNNEST(a) as x
        LEFT JOIN (SELECT '' as y) ON FALSE
    ) 
)
GROUP BY b

with result)

enter image description here

Upvotes: 0

Related Questions