user2554323

Reputation: 11

MySQL distinct results after join

I'm having a lot of headaches with the following query. I believe the problem is duplicate rows returned from the AND portion. MySQL reports a count of 14 for the query below, but realistically there are just 2 distinct results.

select date_format(published,'%Y-%m'),severity,count(severity) 
  FROM nvdcve 
  LEFT JOIN nvdproducts USING(cve_id) 
where (published >= '2013-06-01') 
  AND  (
      (nvdproducts.company='linux' and nvdproducts.product='linux_kernel'
         AND nvdproducts.version IN (SELECT Kernel from Versions.VIEW_kernel))
  OR (nvdproducts.company='apache' and nvdproducts.product='http_server' 
         AND nvdproducts.version IN (SELECT Httpd from Versions.VIEW_httpd))
  OR (nvdproducts.company='sendmail' and nvdproducts.product='sendmail'  AND nvdproducts.version IN (SELECT Sendmail from Versions.VIEW_sendmail))
  OR (nvdproducts.company='mysql' and nvdproducts.product='mysql' AND nvdproducts.version IN (SELECT Mysqld from Versions.VIEW_mysqld)) 
  OR (nvdproducts.company='proftpd' and nvdproducts.product='proftpd' AND nvdproducts.version IN (SELECT Proftpd from Versions.VIEW_proftpd))
  OR (nvdproducts.company='perl' and nvdproducts.product='perl' AND nvdproducts.version IN (SELECT Perl from Versions.VIEW_perl))
  OR (nvdproducts.company='openssl' and nvdproducts.product='openssl' AND nvdproducts.version IN (SELECT Sslinuse from Versions.VIEW_sslinuse))
 ) 
group by date_format(published,'%Y-%m'),severity;

This summary query gives this result.

+--------------------------------+----------+-----------------+
| date_format(published,'%Y-%m') | severity | count(severity) |
+--------------------------------+----------+-----------------+
| 2013-06                        | MEDIUM   |              14 |
+--------------------------------+----------+-----------------+

The closest I can get is listing the distinct rows, but to do that I have to add cve_id and drop the count, which is not what I want.

    select distinct cve_id,date_format(published,'%Y-%m'),severity 
      FROM nvdnew.nvdcve 
 LEFT JOIN nvdproducts USING(cve_id) 
     where (published > '2013-05-31') 
       and published < '2013-07-01' 
       AND (
           (nvdproducts.company='linux' and nvdproducts.product='linux_kernel' AND nvdproducts.version IN (SELECT Kernel from Versions.VIEW_kernel))
         OR (nvdproducts.company='apache' and nvdproducts.product='http_server' AND nvdproducts.version IN (SELECT Httpd from Versions.VIEW_httpd))
         OR (nvdproducts.company='sendmail' and nvdproducts.product='sendmail'  AND nvdproducts.version IN (SELECT Sendmail from Versions.VIEW_sendmail)) 
         OR (nvdproducts.company='mysql' and nvdproducts.product='mysql' AND nvdproducts.version IN (SELECT Mysqld from Versions.VIEW_mysqld))      
         OR (nvdproducts.company='proftpd' and nvdproducts.product='proftpd' AND nvdproducts.version IN (SELECT Proftpd from Versions.VIEW_proftpd))                                          
         OR (nvdproducts.company='perl' and nvdproducts.product='perl' AND nvdproducts.version IN (SELECT Perl from Versions.VIEW_perl))                                        
         OR (nvdproducts.company='openssl' and nvdproducts.product='openssl' AND nvdproducts.version IN (SELECT Sslinuse from Versions.VIEW_sslinuse)) 
   ) 
  order by published;

Here is the result.

+---------------+--------------------------------+----------+
| cve_id        | date_format(published,'%Y-%m') | severity |
+---------------+--------------------------------+----------+
| CVE-2013-2128 | 2013-06                        | MEDIUM   |
| CVE-2013-1862 | 2013-06                        | MEDIUM   |
+---------------+--------------------------------+----------+
2 rows in set (0.00 sec)

Upvotes: 0

Views: 86

Answers (1)

Barmar

Reputation: 781058

Move all the nvdproducts checks into a subquery, so you can use SELECT DISTINCT there to prevent duplication.

SELECT DATE_FORMAT(published,'%Y-%m'), severity, COUNT(severity)
FROM nvdcve
-- Inner join: a LEFT JOIN here would also count CVEs that have no matching product row
JOIN (SELECT DISTINCT cve_id
           FROM nvdproducts
           WHERE (nvdproducts.company='linux' and nvdproducts.product='linux_kernel' AND nvdproducts.version IN (SELECT Kernel from Versions.VIEW_kernel))
                 OR (nvdproducts.company='apache' and nvdproducts.product='http_server' AND nvdproducts.version IN (SELECT Httpd from Versions.VIEW_httpd))
                 OR (nvdproducts.company='sendmail' and nvdproducts.product='sendmail'  AND nvdproducts.version IN (SELECT Sendmail from Versions.VIEW_sendmail))
                 OR (nvdproducts.company='mysql' and nvdproducts.product='mysql' AND nvdproducts.version IN (SELECT Mysqld from Versions.VIEW_mysqld))
                 OR (nvdproducts.company='proftpd' and nvdproducts.product='proftpd' AND nvdproducts.version IN (SELECT Proftpd from Versions.VIEW_proftpd))
                 OR (nvdproducts.company='perl' and nvdproducts.product='perl' AND nvdproducts.version IN (SELECT Perl from Versions.VIEW_perl))
                 OR (nvdproducts.company='openssl' and nvdproducts.product='openssl' AND nvdproducts.version IN (SELECT Sslinuse from Versions.VIEW_sslinuse))
           ) nvdproducts
USING (cve_id)
WHERE published >= '2013-06-01'
GROUP BY DATE_FORMAT(published,'%Y-%m'), severity;
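
To see why the count inflates, here is a minimal sketch on toy data. The table definitions, the `CVE-A`/`CVE-B` ids, and the version strings are all made up for illustration, and the `version IN (SELECT ...)` checks are left out because the `Versions` views are not shown in the question. Each CVE matches several product rows, so the join repeats it once per matching version. `COUNT(DISTINCT cve_id)` is a possible alternative to the subquery if you would rather keep the original join.

```sql
-- Hypothetical, cut-down versions of the two tables (the real ones have more columns)
CREATE TABLE nvdcve (
  cve_id    VARCHAR(20) PRIMARY KEY,
  published DATE,
  severity  VARCHAR(10)
);

CREATE TABLE nvdproducts (
  cve_id  VARCHAR(20),
  company VARCHAR(50),
  product VARCHAR(50),
  version VARCHAR(20)
);

-- Two made-up CVEs published in June 2013
INSERT INTO nvdcve VALUES
  ('CVE-A', '2013-06-05', 'MEDIUM'),
  ('CVE-B', '2013-06-10', 'MEDIUM');

-- One product row per affected version: 3 for CVE-A, 2 for CVE-B
INSERT INTO nvdproducts VALUES
  ('CVE-A', 'linux',  'linux_kernel', '3.0.1'),
  ('CVE-A', 'linux',  'linux_kernel', '3.0.2'),
  ('CVE-A', 'linux',  'linux_kernel', '3.0.3'),
  ('CVE-B', 'apache', 'http_server',  '2.2.1'),
  ('CVE-B', 'apache', 'http_server',  '2.2.2');

-- joined_rows counts every (CVE, version) pair; distinct_cves counts each CVE once
SELECT DATE_FORMAT(published, '%Y-%m') AS pub_month,
       severity,
       COUNT(severity)        AS joined_rows,
       COUNT(DISTINCT cve_id) AS distinct_cves
FROM nvdcve
JOIN nvdproducts USING (cve_id)
WHERE published >= '2013-06-01'
  AND ((company = 'linux'  AND product = 'linux_kernel')
    OR (company = 'apache' AND product = 'http_server'))
GROUP BY DATE_FORMAT(published, '%Y-%m'), severity;
```

On this toy data the single `2013-06` / `MEDIUM` row has `joined_rows` = 5 and `distinct_cves` = 2, which mirrors the 14-versus-2 gap in the question.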

Upvotes: 1
