Reputation: 525
I have a snowflake array as below rows which is an input, which I would want to check for each value in the array value and spit as multiple output arrays based on the value's length for values with 5 digits as one column, and values with 6 digits as another column.
ID_COL,ARRAY_COL_VALUE
1,[22,333,666666]
2,[1,55555,999999999]
3,[22,444]
Output table:
ID_COL,FIVE_DIGIT_COL,SIX_DIGIT_COL
1,[],[666666]
2,[555555],[]
3,[],[]
Please let me know if we could iterate through each array value and perform SQL aggregation to check column length and then output as a separate column outputs. Creating it through SQL would be great, but UDFs using javascript, python if an option would also be great.
Upvotes: 0
Views: 1036
Reputation: 5803
If you're dealing strictly with numbers here's another way
with cte (id, array_col) as
(select 1,[22,333,666666,666666] union all
select 2,[1,22222,55555,999999999] union all
select 3,[22,444])
select *,
concat(',',array_to_string(array_col,',,'),',') as str_col,
regexp_substr_all(str_col,',([^,]{5}),',1,1,'e') as len_5,
regexp_substr_all(str_col,',([^,]{6}),',1,1,'e') as len_6
from cte;
The basic idea is to turn that array into a string and keep all the digits surrounded by ,
so that we can parse the pattern using regex_substr_all
.
If you're dealing with strings, you can modify it to use a delimiter that won't show up in your data.
Upvotes: 0
Reputation: 175726
Using SQL and FLATTEN:
CREATE OR REPLACE TABLE t(ID_COL INT,ARRAY_COL_VALUE VARIANT)
AS
SELECT 1,[22,333,666666] UNION ALL
SELECT 2,[1,55555,999999999] UNION ALL
SELECT 3,[22,444];
Query:
SELECT ID_COL,
ARRAY_AGG(CASE WHEN s.value BETWEEN 10000 AND 99999 THEN s.value END) AS FIVE_DIGIT_COL,
ARRAY_AGG(CASE WHEN s.value BETWEEN 100000 AND 999999 THEN s.value END) AS SIX_DIGIT_COL
FROM t, TABLE(FLATTEN(ARRAY_COL_VALUE)) AS s
GROUP BY ID_COL;
And Python UDF:
create or replace function filter_arr(arr variant, num_digits INT)
returns variant
language python
runtime_version = 3.8
handler = 'main'
as $$
def main(arr, num_digits):
return [x for x in arr if len(str(x))==num_digits]
$$;
SELECT ID_COL,
ARRAY_COL_VALUE,
filter_arr(ARRAY_COL_VALUE, 5),
filter_arr(ARRAY_COL_VALUE, 6)
FROM t;
Output:
Upvotes: 1