PowdyPowPow
PowdyPowPow

Reputation: 132

Sum over multiple levels of nested repeated fields

I have several order-detail tables in the source database: Order Header -> Order Line -> Shipped Line -> Received Line

I create a BQ table with two levels of nested repeated fields. Here is how some sample data looks like:

WITH stol as (
SELECT 1 AS stol_id, "stol-1.1" AS stol_number, 1 AS stol_transfer_order_line_id, 3 AS stol_quantity  
 UNION ALL
 SELECT 2 AS stol_id, "stol-2.1" AS stol_number, 2 AS stol_transfer_order_line_id, 2 AS stol_quantity 
 UNION ALL
 SELECT 3 AS stol_id, "stol-2.2" AS stol_number, 2 AS stol_transfer_order_line_id, 2 AS stol_quantity
 UNION ALL
 SELECT 4 AS stol_id, "stol-2.3" AS stol_number, 2 AS stol_transfer_order_line_id, 1 AS stol_quantity 
),

rtol as (
SELECT 1 AS stol_id, "rtol-1.1" as rtol_number, 2 as rtol_quantity
UNION ALL
SELECT 1 as stol_id, "rtol-1.2" as rtol_number, 1 AS rtol_quantity
UNION ALL
SELECT 2 as stol_id, "rtol-2.1" as rtol_number, 2 AS rtol_quantity
UNION ALL
SELECT 3 as stol_id, "rtol-2.2" as rtol_number, 1 AS rtol_quantity
),

tol as (
SELECT 1 as tol_id, "tol-1" as tol_number, 3 as tol_transfer_quantity
UNION ALL
SELECT 2 as tol_id, "tol-2" AS tol_number, 5 AS tol_transfer_quantity
),

nest AS (
 SELECT s.stol_id,
        s.stol_number,
        s.stol_quantity,
        s.stol_transfer_order_line_id,
        ARRAY_AGG(STRUCT(r.rtol_number, r.rtol_quantity)) as received
 FROM stol s
 LEFT JOIN rtol r ON s.stol_id = r.stol_id
 GROUP BY 1, 2, 3, 4
),

final as (
SELECT t.tol_id
  ,t.tol_number
  ,t.tol_transfer_quantity
  ,ARRAY_AGG(STRUCT(n.stol_number, n.stol_quantity, n.received)) as shipped
FROM tol t
LEFT JOIN nest n ON t.tol_id = n.stol_transfer_order_line_id
GROUP BY 1, 2, 3
)

I want to sum the shipped and received quantities for each order line. I can get the correct result like so:

shipped as (
SELECT tol_number
  ,SUM(stol_quantity) as shipped_q
FROM final t, t.shipped
GROUP BY 1
),

received as (
SELECT tol_number
  ,SUM(rtol_quantity) as received_q
FROM final t, t.shipped s, s.received
GROUP BY 1
)

SELECT t.tol_number
  ,t.tol_transfer_quantity
  ,s.shipped_q
  ,r.received_q
FROM final t
LEFT JOIN shipped s on t.tol_number = s.tol_number
LEFT JOIN received r ON t.tol_number = r.tol_number

Correct results:

Row tol_number  tol_transfer_quantity   shipped_q   received_q   
1     tol-1             3                  3            3    
2     tol-2             5                  5            3

What i am wondering is if there is a better way to do this? Trying something like this will over count the first level of nesting but just feels and looks a lot cleaner:

SELECT tol_number
      ,tol_transfer_quantity
      ,SUM(stol_quantity) as shipped_q
      ,SUM(rtol_quantity) as shipped_r
FROM final t, t.shipped s, s.received
GROUP BY 1, 2

Wrong result for shipped_q:

Row tol_number  tol_transfer_quantity   shipped_q   shipped_r    
1     tol-2              5                 5            3    
2     tol-1              3                 6            3

Many thanks for any ideas.

Upvotes: 3

Views: 1029

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

#standardSQL
SELECT
  tol_id,
  tol_transfer_quantity,
  (SELECT SUM(stol_quantity) FROM final.shipped) shipped_q,
  (SELECT SUM(rtol_quantity) FROM final.shipped s, s.received) shipped_r
FROM final

Upvotes: 3

Martin Weitzmann
Martin Weitzmann

Reputation: 4746

I'd suggest you use sub-selects in which you treat your arrays like tables:

SELECT
  tol_id,
  SUM(tol_transfer_quantity),
  SUM( (SELECT SUM(stol_quantity) FROM final.shipped) ) shipped_q,
  SUM( (SELECT SUM(rtol_quantity) FROM final.shipped s, s.received) ) shipped_r
FROM
  final
GROUP BY
  1

hth!

Upvotes: 2

Related Questions