John Thomas
John Thomas

Reputation: 1105

Using SQL to categorize a multiple rows of a column using Snowflake

A tricky conundrum I'm trying to figure out in Snowflake.

Let's say I have data like this

ID    tag
001     A
001     A
002     B
003     A
004     1
003     1
005     B
005     2
004     B
002     C
006     A
006     2
006     A

And basically, my goal is I would like to categorize each ID into a unique table on the following criteria. So across ALL of a given ID...

And if any other values appear, no issue, I only care about 1,2,A,B ; each ID will have a row with atleast one of these.

So the resulting DF will be...

ID      GROUP
001     GROUPA
002     GROUPB
003     GROUPA
004     NULL
005     GROUPB
006     NULL

Notice, 004 and 006 were nulled out because in 004 both 1 and B appeared. Similarly, even though A appeared twice in 006, the 2 does not match and thus is NULL.

Upvotes: 1

Views: 242

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 175716

Using conditional aggregation, here: COUNT_IF:

SELECT 
    ID,
    CASE WHEN COUNT_IF(tag IN ('1','A')) > 0 AND COUNT_IF(tag IN ('2','B')) > 0 THEN NULL
         WHEN COUNT_IF(tag IN ('1','A')) > 0 THEN 'GROUPA'
         WHEN COUNT_IF(tag IN ('2','B')) > 0 THEN 'GROUPB'
    END AS grp
FROM tab
WHERE tag IN ('1', '2', 'A', 'B')
GROUP BY ID
ORDER BY ID;

Upvotes: 1

Related Questions