Sam B
Sam B

Reputation: 23

RedShift self-join on array

I have a table within Redshift with a self-referential ParentID to join back to the ProductID to create a hierarchy.

Basic structure (full table is c6k records, with arrays up to 90 IDs):

CREATE TABLE sandbox.productmk2 (
    productid integer ENCODE az64,
    product character varying(500) ENCODE lzo,
    parentid super
)
DISTSTYLE AUTO;
ProductID Product ParentID
1 Headwear [0]
2 Woolwear [0]
3 Hats [1,2]

ParentID is stored as a SUPER type and contains an array.

From a bit of research online, it has been suggested to use the ANY condition. However, I get the following error, even if I cast it again as an array.

ERROR: op ANY/ALL (array) requires array on right side,

    SELECT a.ProductID, a.product, b.product as Parent
    FROM sandbox.productmk2 A
    JOIN sandbox.productmk2 b on b.productid = ANY(a.parentid)

Expected output

ProductID Product Parent
3 Hats Headwear
3 Hats Woolwear

Changing the ANY(a.parentid) to a.parentid[0] successfully joins on the first ID, so the array looks to be stored correctly.

Is there something I'm missing regarding the ALL operator, or is there a better way to deal with this?

Unfortunately, it's challenging to research, considering this is a relatively new feature in Redshift, and most articles I come across say arrays aren't supported in RS...

Thanks in advance

Upvotes: 0

Views: 232

Answers (2)

Sam B
Sam B

Reputation: 23

I have found a way around to get a working solution, although it is far from elegant (I'm not a fan of joining in the where clause).

If anybody can improve on this, I'll be very keen to learn!

SELECT A.*, C.PRODUCT AS Parent
FROM sandbox.productmk2 AS A , a.parentid AS b, sandbox.productmk2 AS C
WHERE B = C.productid;

Upvotes: 0

Lenroy
Lenroy

Reputation: 341

It seems like you're dealing with array handling in Redshift. Instead of using the ANY condition, try using UNNEST to expand the arrays and then join on the elements

SELECT a.ProductID, a.product, b.product as Parent
FROM sandbox.productmk2 a
JOIN sandbox.productmk2 b ON b.productid = 
UNNEST(a.parentid)

Upvotes: 0

Related Questions