fdkgfosfskjdlsjdlkfsf
fdkgfosfskjdlsjdlkfsf

Reputation: 3301

Left join returning count of 1 even though no rows exist for that count?

I have a query that looks like the following. It uses a cte:

SELECT  d.hour, 
        hourkey, 
        range, 
        COUNT(*) as 'count'
FROM dimhour d 
    LEFT JOIN cte2 
        ON d.hour = cte2.hourkey
        AND range IS NOT NULL
WHERE d.hour <= 23
GROUP BY d.hour, 
         hourkey, 
         range                
ORDER BY d.hour DESC

This is the result of this query:

hour   hourkey  range    count
18     NULL     NULL     1
17     NULL     NULL     1
16     NULL     NULL     1
15     15       99%      15
14     14       99%      15
13     13       99%      15
12     12       99%      15
11     11       99%      15
10     10       99%      15

The result of cte2 is too large to post here, but I can tell you that the hours where cte2.hourkey is null do not have a single row within cte2. There's nothing after Hr15 that could possibly return a positive number for count(*). But count(*) somehow returns 1 for those.

Why is this query returning a count(*) of 1 for non-existing hours and how can I remove them?

Upvotes: 2

Views: 1950

Answers (3)

user743382
user743382

Reputation:

To answer from a different perspective, in a way I think may be easier to understand:

If you take out the GROUP BY, you get:

hour   hourkey  range
18     NULL     NULL
17     NULL     NULL
16     NULL     NULL
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
15     15       99%
14     14       99%
...

If you now group by hour, hourkey, range, how many rows do you see for 18, NULL, NULL? I spot one row. It's right there at the top. That's what COUNT(*) is returning.

And that's how to approach this too: the count you want, is that a count you can easily obtain from this result set? If not, then look for another approach. In your case, you appear to want to do the grouping on cte2. So write that instead!

select d.hour, sub.hourkey, sub.range, sub."count"
from dimhour d
left join (
    select cte2.hourkey, cte2.range, count(*) as "count"
    from cte2
    where cte2.range is not null
    group by cte2.hourkey, cte2.range
) as sub
on d.hour = sub.hourkey
where d.hour <= 23
order by d.hour desc

One difference is that now instead of 0, you'll get NULL, but you could use ISNULL(sub."count", 0) to avoid that if it bothers you.

Upvotes: 0

Kateract
Kateract

Reputation: 852

COUNT(*) will return a count of all rows in a particular group, including the row it's on. If you want it to count a particular column, you need to specify the column, such as COUNT(cte2.hourkey). This will count non-null records within the group.

SELECT d.hour, hourkey, range, COUNT(cte2.hourkey) AS [count]
FROM dimhour d LEFT JOIN cte2 ON
d.hour = cte2.hourkey
AND range IS NOT NULL
WHERE d.hour <= 23
GROUP BY d.hour, hourkey, range                
ORDER BY d.hour desc

If you don't want the rows to show at all, you need an INNER JOIN instead of a LEFT JOIN:

SELECT d.hour, hourkey, range, COUNT(cte2.hourkey) AS [count]
FROM dimhour d INNER JOIN cte2 ON
d.hour = cte2.hourkey
AND range IS NOT NULL
WHERE d.hour <= 23
GROUP BY d.hour, hourkey, range                
ORDER BY d.hour desc

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1270391

The expression:

count(*) as [count]

returns the number of rows in the result set. LEFT JOIN is guaranteeing at least one row, even if it doesn't match.

To count the matches, then count one of the columns used for the JOIN (or the primary key):

count(cte2.hourkey) as [count]

Note: Only use single quotes for string and date constants. Using them for columns is likely to cause problems in the future.

Upvotes: 2

Related Questions