Reputation: 383
How can i split an array of values in a column into corresponding rows in Redshift using a delimiter (,) ?
Input Data:-
—————————————
Empid | Items
—————————————
1001| A, B
1002| B
1003| C, D, E
Required Output:-
—————————————
Empid | Items
—————————————
1001| A
1001| B
1002| B
1003| C
1003| D
1003| E
Any help is appreciated.
Thanks
Upvotes: 4
Views: 14371
Reputation: 403
Based on the official docs, you can do with JOIN
!
Let's say your input is:
—————————————
empid | items
—————————————
1001| [A, B]
1002| [B]
1003| [C, D, E]
1004| []
Then you can do it as:
SELECT t.empid, items as item
FROM table_name AS t
LEFT JOIN t.items AS items ON TRUE
This will returns:
—————————————
empid | item
—————————————
1001| A
1001| B
1002| B
1003| C
1003| D
1003| E
1004| <NULL>
Upvotes: 12
Reputation: 3455
Actually with the addition of stored procedures to Redshift this is possible
The procedure below accepts two parameters (source_table
and target_table
)
assuming both table exists it transforms the data described in the question
The way it works is
CREATE OR REPLACE PROCEDURE Array_to_Rows(source_table VARCHAR, target_table VARCHAR)
LANGUAGE plpgsql
AS $$
DECLARE i INTEGER;
rec RECORD;
query VARCHAR;
item VARCHAR;
cnt INTEGER;
BEGIN
query := 'SELECT * FROM ' || source_table;
FOR rec IN EXECUTE query
LOOP
select INTO cnt regexp_count(rec.items,',')+1;
i := 1;
<< items_loop >>
LOOP
SELECT INTO item trim(split_part(rec.items,',',i));
EXECUTE 'INSERT INTO ' || target_table || ' values (' || rec.Empid || ',''' || item ||''')';
i := i + 1;
EXIT items_loop WHEN (i > cnt);
END LOOP;
END LOOP;
END;
$$
Usage: CALL Array_to_Rows('source table name
','target table name
')
With test data in the question it took less than 0.2 seconds, don't know how big OPs data set is
Output is
Empid item
1001 A
1001 B
1002 B
1003 C
1003 D
1003 E
Upvotes: 0