How do I do a count a value based on a second distinct column in the same query?

Question

I've read through some other similar threads but can't seem to find any suitable solutions for my situation.

Here's a simplified version of the query I'm starting with:

select 
   val1, 
   val2, 
   count(distinct(val3)) 
from
(
   [select with joins that produce some duplicate values]
) as tbl1
group by val1, val2

What this effectively does is give a distinct row count for any combination of val1 and val2.

What I'm trying to do now is add another count which counts a separate value like so -

select 
   val1, 
   val2, 
   count(distinct(val3)),
   count(case when val4 = 'Y' then 1 else null end) 
from
(
   [select with joins that produce some duplicate values]
) as tbl1
group by val1, val2

The problem I am having with this above logic is that the count on val4 is counting ALL rows with a 'Y' value, not just rows with a distinct val3, which is causing it to have a much larger count than it should.

Essentially what I'm trying to achieve is roughly this logic -

count(case when val3 IS DISTINCT and val4 = 'Y' then 1 else null end)

But as far as I can see I don't see a good way to check whether or not a column is distinct within a case statement so I'm stumped here. I'm open to seeing some completely different solutions to this problem as well.

Edit: another note on this, val4 will always be the same for every val3. So there should be no conflict where there are multiple rows of val3 with different val4 values.

Gordon Linoff · Accepted Answer

If I understand correctly, you want the distinct val3s that have a 'Y'. Just use conditional aggregation:

count(distinct case when val4 = 'Y' then val3 end)

How do I do a count a value based on a second distinct column in the same query?

Answers (1)

Related Questions