Reputation: 13
I have a transaction table and I need to group the similar records and for the class column which might be unique, I need to pick the top most value from the lookup table(class table) our of the values for the similar records ( loopup table is ordered based on the priority).
select * from class;
ID NAME
2 BETA
6 OMEGA
5 SIGMA
1 ALPHA1
3 GAMMA
4 DELTA
CREATE OR REPLACE FUNCTION "MIN_VALUE"(classlist array)
RETURNS VARCHAR(200)
LANGUAGE SQL
AS '
select NAME from CLASS
where ID in ( select min(ID) from CLASS
where NAME in (select value from table(flatten(input=>classlist))))
';
select * from T_DATA;
C_ID P_ID D_ID S_ID CLASS
1101111 1404 564 1404 BETA
1101111 1404 599 1425 ALPHA
1101111 1404 564 1404 OMEGA
1101111 1404 564 1425 ALPHA
1101111 1404 564 1404 GAMMA
1101111 1404 564 1425 GAMMA
1101111 1404 599 1425 GAMMA
1101111 1404 564 1425 OMEGA
When I write a query like below it works FINE
select MIN_VALUE(array_construct('OMEGA','GAMMA','BETA'));
When I use it in the actual query it fails with SQL compilation error: Unsupported subquery type cannot be evaluated
select C_ID, P_ID, D_ID, S_ID, MIN_VALUE(class_array) from (
select C_ID, P_ID, D_ID, S_ID, arrayagg(class) class_array
from t_data
group by C_ID,P_ID,D_ID,S_ID
);
OR
select C_ID,P_ID,D_ID,S_ID,MIN_VALUE(ca) from (
select C_ID,P_ID,D_ID,S_ID,array_construct(class_array) ca from (
select C_ID,P_ID,D_ID,S_ID,arrayagg(class) class_array
from t_data
group by C_ID,P_ID,D_ID,S_ID
)
);
I am expecting an output like below from the 8 records above
select C_ID,P_ID,D_ID,S_ID,array_construct(class_array) ca from (
select C_ID,P_ID,D_ID,S_ID,arrayagg(class) class_array
from t_data
group by C_ID,P_ID,D_ID,S_ID
);
Output
C_ID P_ID D_ID S_ID CLASS_ARRAY
1101111 1404 564 1404 ["OMEGA", "GAMMA", "BETA"]
1101111 1404 599 1425 ["ALPHA", "GAMMA"]
1101111 1404 564 1425 ["ALPHA", "GAMMA", "OMEGA"]
When I use the min_value function on the above class_array that will return a single value based on the priority in the lookup table.
C_ID P_ID D_ID S_ID CLASS_ARRAY
1101111 1404 564 1404 BETA
1101111 1404 599 1425 ALPHA
1101111 1404 564 1425 ALPHA
Please suggest some options to figure out why the function runs fine for hardcoded values but fails if the Array is constructed in the query and passed as parameter.
Upvotes: 1
Views: 1869
Reputation: 25938
This can also be done using the QUALIFY filter which allows filtering after the select stage and having that filter logic not show in the results.
with class as (
select * from values
(2, 'BETA'),
(6, 'OMEGA'),
(5, 'SIGMA'),
(1, 'ALPHA'),
(3, 'GAMMA'),
(4, 'DELTA')
v(id, name)
), t_data as (
select * from values
(1101111, 1404, 564, 1404, 'BETA'),
(1101111, 1404, 599, 1425, 'ALPHA'),
(1101111, 1404, 564, 1404, 'OMEGA'),
(1101111, 1404, 564, 1425, 'ALPHA'),
(1101111, 1404, 564, 1404, 'GAMMA'),
(1101111, 1404, 564, 1425, 'GAMMA'),
(1101111, 1404, 599, 1425, 'GAMMA'),
(1101111, 1404, 564, 1425, 'OMEGA')
v(C_ID, P_ID, D_ID, S_ID, CLASS)
)
select c_id, p_id, d_id, s_id, d.class
from t_data d
join class c on d.class = c.name
qualify row_number() over (partition by c_id, p_id, d_id, s_id order by c.id) = 1;
gives:
C_ID P_ID D_ID S_ID CLASS
1101111 1404 564 1404 BETA
1101111 1404 564 1425 ALPHA
1101111 1404 599 1425 ALPHA
which is the same as the more explicit/verbose form:
select c_id, p_id, d_id, s_id, class from (
select c_id, p_id, d_id, s_id, d.class
,row_number() over (partition by c_id, p_id, d_id, s_id order by c.id) as rn
from t_data d
join class c on d.class = c.name
)
where rn = 1;
which is really the same mechanics as Stuart's DISTINCT
If you are really want to do it via array's you can order the array on construction with WITHIN GROUP (ORDER BY ..)
and then you can take the first object, but FIRST_VALUE or QUALIFY methods should be faster.. but if there are other reasons to keep the array this might help
select C_ID, P_ID, D_ID, S_ID, class_array[0] ca from (
select C_ID, P_ID, D_ID, S_ID, arrayagg(class) within group (order by class.id) class_array
from t_data
join class on t_data.class = class.name
group by C_ID,P_ID,D_ID,S_ID
);
Upvotes: 0
Reputation: 1384
Snowflake has some limitations in its support for SQL statements that include some SELECT patterns within column definitions. There are a couple of ways to rewrite the query above to obtain the desired result:
1) Find the min ID and then join back to the class table:
with T as (
select C_ID, P_ID, D_ID, S_ID, min(class.id) minclassid
from t_data join class
on class.name = t_data.class
group by C_ID,P_ID,D_ID,S_ID
)
select C_ID, P_ID, D_ID, S_ID, class.name
from T join CLASS on minclassid = class.id;
2) Or use a Windowing function to grab the first class name ordered by ID within a group:
select distinct C_ID, P_ID, D_ID, S_ID,
first_value(class.name) over
(partition by C_ID, P_ID, D_ID, S_ID order by class.id) name
from t_data join class
on class.name = t_data.class;
Upvotes: 1