msleone
msleone

Reputation: 227

SELECTING data with CASE statement issue

I'm currently trying to formulate a query to determine whether certain software applications are installed on said host. Our table in our DB was set up in a way that there are columns containing the Host Name and Software so for instance there could be 10 records of Host A showcasing 10 individual pieces of software found on it. ex. Below

ID   Host          SW        
--------------------------
1    Host A        SW A
1    Host A        SW B     
1    Host A        SW C
2    Host B        SW B
2    Host B        SW C
3    Host C        SW F

What I'm trying to find are all the hosts that have software A installed(Flag Y/N) and software B installed(Flag Y/N) and format the results in a way that (using the example above) it would return...

Expected Results    

Host          SWA_Installed      SWB_Installed
----------------------------------------------
Host A        Y                  Y
Host B        N                  Y
Host C        N                  N

The query I have below is what I have so far (note actual DB code, those are actual SW names)...

SELECT DISTINCT
 HOST.a_display_label AS HOST,
 CASE 
  WHEN 
   sof.a_display_label = 'HPS' THEN 'Y' 
  ELSE 'N' 
 END AS HPS_INSTALLED,
 CASE 
  WHEN 
   sof.a_display_label = 'PowerToken' THEN 'Y' 
  ELSE 'N' 
 END AS PT_INSTALLED
FROM vcms.node_1 HOST
  JOIN vcms.installed_software_1 sof
    ON LOWER (sof.a_root_container) = LOWER (HOST.cmdb_id)

The issue is that even though it's grabbing DISTINCT results, it's still returning multiple records for hosts because of multiple host records of each software installed on it. Ultimately the results look like this:

Query Results

Host          SWA_Installed      SWB_Installed
----------------------------------------------
Host A        Y                  N
Host A        N                  Y
Host A        N                  N
Host B        N                  N
Host B        N                  Y
Host C        N                  N

I'm wondering what additional information I need to return the "expected results" data. Thank you in advance.

Upvotes: 4

Views: 98

Answers (3)

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49270

You need to aggregate on the Host column. Use a max or min on the case expressions, because either of the results would be the same as the case expression returns one value per row.

SELECT 
 HOST.a_display_label AS HOST,
 MAX(CASE 
  WHEN 
   sof.a_display_label = 'HPS' THEN 'Y' 
  ELSE 'N' 
 END) AS HPS_INSTALLED,
 MAX(CASE 
  WHEN 
   sof.a_display_label = 'PowerToken' THEN 'Y' 
  ELSE 'N' 
 END) AS PT_INSTALLED
FROM vcms.node_1 HOST
  JOIN vcms.installed_software_1 sof
    ON LOWER (sof.a_root_container) = LOWER (HOST.cmdb_id)
GROUP BY HOST.a_display_label

The results would be reliable only when there is one row per a_display_label based on the join condition. Else you would still get multiple rows and aggregating in that case would be incorrect.

Upvotes: 1

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48197

I just realize mine isnt as efficient as other answer but just for reference.

note this is sql server, but same work for oracle

SQL FIDDLE DEMO

SELECT  [Host],
        CASE WHEN 
             COUNT( CASE WHEN [SW] = 'SW A' THEN 1 END)  >= 1 THEN 'YES' 
                                                              ELSE 'NO' 
        END as sw_a,
        CASE WHEN 
             COUNT( CASE WHEN [SW] = 'SW B' THEN 1 END)  >= 1 THEN 'YES' 
                                                              ELSE 'NO' 
        END as sw_b

FROM hosts
GROUP BY [Host]

OUTPUT

enter image description here

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270713

Simply use conditional aggregation:

select h.a_display_label as host,
       max(case when s.a_display_label = 'HPS' then 'Y' else 'N'
           end) as hps_installed,
       max(case when s.a_display_label = 'PowerToken' then 'Y' else 'N'
           end) as pt_installed
from vcms.node_1 h join
     vcms.installed_software_1 s
     on lower(s.a_root_container) = lower(h.cmdb_id)
group by h.a_display_label;

Note: This depends on the fact that 'Y' > 'N', so the logic would not work with min() instead of max(). I would normally just use "1" for yes and "0" for no, rather than characters.

I should also comment that I find SELECT DISTINCT with aggregation functions to be a very awkward construct. You should really be thinking in terms of GROUP BY rather than SELECT DISTINCT for aggregation queries.

Upvotes: 2

Related Questions