TaylorPLM
TaylorPLM

Reputation: 101

MySQL - Conflicting WHERE and GROUP BY Statements

I have a query returning the counts of some data, but I do NOT want data that has a null value in it...

As an example, the code rolls stats from a clicking system into a table.

SELECT sh.dropid,
COUNT(DISTINCT IF(sh2.`SentDate` IS NOT NULL, sh2.`subid`, NULL)) AS sentCount,
COUNT(DISTINCT IF(sh2.`Open` = 1, sh2.`subid`, NULL)) AS openCount,
COUNT(DISTINCT IF(sh2.`Click` = 1, sh2.`subhistid`, NULL)) AS clickCount,
COUNT(DISTINCT IF(sh2.`Bounced` = 1, sh2.`subid`, NULL)) AS bounceCount,
COUNT(DISTINCT IF(sh2.`Unsubscribed` = 1, sh2.`subid`, NULL)) AS unsubCount,
COUNT(DISTINCT IF(sh2.`Abuse` = 1, sh2.`subid`, NULL)) AS abuseCount
FROM subscriberhistory sh
INNER JOIN subscriberhistory sh2 ON sh.subid = sh2.subid
WHERE sh.SentDate >= '#runDate# #lastRunTime#'
AND sh.dropid IS NOT NULL
AND sh.dropid != ""
OR (sh.SentDate IS NULL AND sh.OpenDate >= '#runDate# #lastRunTime#')
OR (sh.SentDate IS NULL AND sh.ClickDate >= '#runDate# #lastRunTime#')
OR (sh.SentDate IS NULL AND sh.UnsubscribeDate >= '#runDate# #lastRunTime#')
OR (sh.SentDate IS NULL AND sh.BouncedDate >= '#runDate# #lastRunTime#')
OR (sh.SentDate IS NULL AND sh.AbuseDate >= '#runDate# #lastRunTime#')
GROUP BY dropid
ORDER BY sentCount DESC

EDIT: Corrected query

SELECT sh.dropid,
COUNT(DISTINCT IF(sh2.`SentDate` IS NOT NULL, sh2.`subid`, NULL)) AS sentCount,
COUNT(DISTINCT IF(sh2.`Open` = 1, sh2.`subid`, NULL)) AS openCount,
COUNT(DISTINCT IF(sh2.`Click` = 1, sh2.`subid`, NULL)) AS clickCount,
COUNT(DISTINCT IF(sh2.`Bounced` = 1, sh2.`subid`, NULL)) AS bounceCount,  
COUNT(DISTINCT IF(sh2.`Unsubscribed` = 1, sh2.`subid`, NULL)) AS unsubCount,
COUNT(DISTINCT IF(sh2.`Abuse` = 1, sh2.`subid`, NULL)) AS abuseCount
FROM subscriberhistory sh
INNER JOIN subscriberhistory sh2 ON sh.subid = sh2.subid
WHERE sh.dropid IS NOT NULL AND sh.dropid != ""
AND ((sh.SentDate >= '2012-04-10 14:15') 
OR (sh.SentDate IS NULL AND sh.OpenDate >= '2012-04-10 14:15')
OR (sh.SentDate IS NULL AND sh.ClickDate >= '2012-04-10 14:15')
OR (sh.SentDate IS NULL AND sh.UnsubscribeDate >= '2012-04-10 14:15')
OR (sh.SentDate IS NULL AND sh.BouncedDate >= '2012-04-10 14:15')
OR (sh.SentDate IS NULL AND sh.AbuseDate >= '2012-04-10 14:15'))
GROUP BY sh.dropid
ORDER BY sentCount DESC

An example of the record set returned would look like this...

(dropid) sent opens  clicks
400      2    3      4
401      2    3      6
NULL     2    3      4

Again, the goal is to keep the NULL data out of the preceding record set. Could someone explain to me why this behavior is occurring and what to do to solve it.

Upvotes: 0

Views: 125

Answers (1)

mathematical.coffee
mathematical.coffee

Reputation: 56915

As mentioned in my comment - the NULL rows are creeping in through the OR condition in the WHERE statement.

Consider:

WHERE sh.dropid IS NOT NULL
AND   sh.dropid != ""
OR    ( some.other.condition )

Suppose you had a row with sh.dropid being NULL but still satisfying some.other.condition - then the entire WHERE clause would evaluate to TRUE because of the OR.

Solution - add brackets. e.g. (Depends on the logic you're after):

WHERE sh.dropid IS NOT NULL
  AND sh.dropid != ""
  AND (   one.condition
       OR second.condition
       OR third.condition
       )

Upvotes: 1

Related Questions