Dante
Dante

Reputation: 45

Redshift. Convert comma delimited values into rows with all combinations

I have:

user_id|user_name|user_action
-----------------------------
1      | Shone   | start,stop,cancell

I would like to see:

user_id|user_name|parsed_action 
------------------------------- 
1      | Shone   | start 
1      | Shone   | start,stop 
1      | Shone   | start,cancell
1      | Shone   | start,stop,cancell       
1      | Shone   | stop         
1      | Shone   | stop,cancell
1      | Shone   | cancell      
....

Upvotes: 0

Views: 614

Answers (1)

AlexYes
AlexYes

Reputation: 4208

You can create the following Python UDF:

create or replace function get_unique_combinations(list varchar(max))
returns varchar(max)
stable as $$

from itertools import combinations

arr = list.split(',')

response = []

for L in range(1, len(arr)+1):
    for subset in combinations(arr, L):
        response.append(','.join(subset))

return ';'.join(response)

$$ language plpythonu;

that will take your list of actions and return unique combinations separated by semicolon (elements in combinations themselves will be separated by commas). Then you use a UNION hack to split values into separate rows like this:

WITH unique_combinations as (
    SELECT 
     user_id
    ,user_name
    ,get_unique_combinations(user_actions) as action_combinations
    FROM your_table
)
,unwrap_lists as (
    SELECT 
     user_id
    ,user_name
    ,split_part(action_combinations,';',1) as parsed_action
    FROM unique_combinations
    UNION ALL
    SELECT 
     user_id
    ,user_name
    ,split_part(action_combinations,';',2) as parsed_action
    FROM unique_combinations
    -- as much UNIONS as possible combinations you have for a single element, with the 3rd parameter (1-based array index) increasing by 1
    )
SELECT *
FROM unwrap_lists
WHERE parsed_action is not null

Upvotes: 1

Related Questions