Joseph Yourine
Joseph Yourine

Reputation: 1331

Update twice nested repeated record

I'm struggling with this query (dummy version, there are much more fields in it) :

UPDATE 
  table1 as base
SET 
  lines = 
    ARRAY(
          SELECT AS STRUCT 
            b.line_id,
            s.purch_id,
            ARRAY(
                  SELECT AS STRUCT
                    wh.warehouse_id,
                    s.is_proposed,
                  FROM table1 as t, UNNEST(lines) as lb, UNNEST(lb.warehouses) as wh
                  INNER JOIN 
                    (SELECT 
                      l.line_id,
                      wh.is_proposed
                     FROM table2, UNNEST(lines) as l, UNNEST(l.warehouses) as wh) as s
                  ON lb.line_id = s.line_id AND wh.warehouse_id = s.warehouse_id)
          FROM table1, UNNEST(lines) as b
          INNER JOIN UNNEST(supply.lines) as s
          ON b.line_id = s.line_id)
FROM 
  table2 as supply
WHERE 
  base.date = supply.date
  AND
  base.sales_id = supply.sales_id

table1 and table2 have the same nesting :

(so {... , lines [{... warehouses [)

Plus table1 is a subset of table2 with a subset of its fields, table1 having them NULL from start (I refresh information when data is available because informations are asynchroneous).

I first tried this as a first step (which succeeds) :

UPDATE 
  table1 as base
SET 
  lines = 
    ARRAY(
          SELECT AS STRUCT 
            b.line_id,
            s.purch_id,
            b.warehouses
          FROM table1, UNNEST(lines) as b
          INNER JOIN UNNEST(supply.lines) as s
          ON b.line_id = s.line_id)
FROM 
  table2 as supply
WHERE 
  base.date = supply.date
  AND
  base.sales_id = supply.sales_id

But the fact I actually need to update lines.warehouses too so I'm happy it works but it is not enough.

The full query is valid and when I try the last ARRAY part in a terminal, the query is fast and the output has no duplicate. Still the complete UPDATE does not end (after 20 min, I killed it).

Tables are not that large, 20k from both side (220k completely flattened).

So am I doing something wrong ? Is there a better way ?

Thanks

Upvotes: 0

Views: 120

Answers (1)

Joseph Yourine
Joseph Yourine

Reputation: 1331

I finally solved the issue, Iit was way simpler than I thought. I think I misunderstood how the whole query nesting worked.

So I just linked every data avaiable, from the first singular row matched to the last array since filtered data at top level is propagated to lower levels.

UPDATE 
  table1 as base
SET 
  lines = 
    ARRAY(
          SELECT AS STRUCT 
            b.line_id,
            s.purch_id,
            ARRAY(
                  SELECT AS STRUCT
                    wh.warehouse_id,
                    sh.is_proposed,
                  FROM UNNEST(b.warehouses) as wh -- take only upper level data
                  INNER JOIN UNNEST(s.warehouses) as sh -- idem
                  ON wh.warehouse_id = sh.warehouse_id) -- no need to 'redo' the joining on already filtered ones
          FROM UNNEST(base.lines) as b
          INNER JOIN UNNEST(supply.lines) as s
          ON b.line_id = s.line_id)
FROM 
  table2 as supply
WHERE 
  base.date = supply.date
  AND
  base.sales_id = supply.sales_id

The query succeeds in less than 1min

Upvotes: 1

Related Questions