Kare
Kare

Reputation: 11

Presto SQL - Trying to pull data from multiple columns into one entry to find a unique, missing, or dupe entry

new to SQL/Presto here.
Feel free to point out the obvious if needed.

I have a sub query that pulls data into a table like below.
For each ItemID, 1 would mean that the tag is on, 0 is off.

I am trying to make a query that would pull up each ItemID with its associated tag if its unique, otherwise point out if there is more than one or if its missing.

Data_Table
| ItemID | TagA | TagB | TagC | TagD | TagE |
|  111   |  1   |  1   |  0   |  0   |  0   | 
|  222   |  1   |  1   |  1   |  0   |  0   | 
|  333   |  1   |  1   |  0   |  0   |  0   | 
|  444   |  0   |  1   |  0   |  0   |  0   | 
|  555   |  0   |  0   |  0   |  0   |  0   | 
|  666   |  0   |  0   |  0   |  1   |  1   | 

I tried a case when statement that pull each 1 and another case query that tries to convert each column into just one row entry.

SELECT Item_ID,
   CASE WHEN (Tag_A+Tag_B+Tag_C+Tag_D+Tag_E > 1) THEN 'Dupe'
     ELSE (CASE WHEN Tag_A = 1 THEN 'TagA_Present'
           WHEN Tag_B = 1 THEN 'TagB_Present'
           WHEN Tag_C = 1 THEN 'TagC_Present'
           WHEN Tag_D = 1 THEN 'TagD_Present'
           WHEN Tag_E = 1 THEN 'TagE_Present'
           ELSE 'Missing_Tag' END)
     END as ItemTag
FROM Data_Table

EDITED - I went too far with the sample data and initial query has been changed.

Actual Results
| ItemID | ItemTag |
|  111   |     Dupe     | 
|  222   | TagA_Present | 
|  333   | TagB_Present | 
|  444   | TagB_Present | 
|  555   |    Missing   | 
|  666   | TagD_Present | 

ItemID 111, 222, 333, and 666 should all be 'Dupe', but the results seems to be deeming random ones unique.

Upvotes: 0

Views: 999

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269923

Hmmm. I am thinking:

select t.itemId,
       (case when (TagA + TagB + TagC + TagD + TagE) > 1 then 'Dupe'
             when TagA = 1 then 'TagA'
             when TagB = 1 then 'TagB'
             when TagC = 1 then 'TagC'
             when TagD = 1 then 'TagD'
             when TagE = 1 then 'TagE'
             else 'Missing'
        end) as ItemTag
from Data_Table;

There is no reason to use aggregation for this.

Upvotes: 1

Related Questions