derekhh
derekhh

Reputation: 5532

COUNT DISTINCT with CONDITIONS

I want to count the number of distinct items in a column subject to a certain condition, for example if the table is like this:

tag | entryID
----+---------
foo | 0
foo | 0
bar | 3

If I want to count the number of distinct tags as "tag count" and count the number of distinct tags with entry id > 0 as "positive tag count" in the same table, what should I do?

I'm now counting from two different tables where in the second table I've only selected those rows with entryID larger than zero. I think there should be a more compact way to solve this problem.

Upvotes: 164

Views: 532643

Answers (7)

Echsecutor
Echsecutor

Reputation: 854

If you are using an SQL dialect which supports FILTER on counts, such as Postgresql, you can write ntalbs's answer in the slightly more readable form

SELECT
  COUNT(DISTINCT tag) AS tag_count,
  COUNT(DISTINCT tag) FILTER (WHERE entryID > 0) AS positive_tag_count
FROM
  your_table_name;

This generalizes to all aggregate functions.

Upvotes: 0

Ukpa Uchechi
Ukpa Uchechi

Reputation: 730

I agree with @ntalbs solution, if you want to count a column's data when the condition of another column's data is valid, you can do this

select
  count(distinct tag) as tag_count,
  count(distinct tag, case when entryId > 0 then tag end) as positive_tag_count
from
  your_table_name;

On line 3, I added the column name beside the distinct, so it will count the distinct tags when the entryId is greater than 0

Upvotes: 2

Abhishek Gupta
Abhishek Gupta

Reputation: 21

Code counts the unique/distinct combination of Tag & Entry ID when [Entry Id]>0

select count(distinct(concat(tag,entryId)))
from customers
where id>0

In the output it will display the count of unique values Hope this helps

Upvotes: 2

ntalbs
ntalbs

Reputation: 29438

You can try this:

select
  count(distinct tag) as tag_count,
  count(distinct (case when entryId > 0 then tag end)) as positive_tag_count
from
  your_table_name;

The first count(distinct...) is easy. The second one, looks somewhat complex, is actually the same as the first one, except that you use case...when clause. In the case...when clause, you filter only positive values. Zeros or negative values would be evaluated as null and won't be included in count.

One thing to note here is that this can be done by reading the table once. When it seems that you have to read the same table twice or more, it can actually be done by reading once, in most of the time. As a result, it will finish the task a lot faster with less I/O.

Upvotes: 383

MJBLACKEND
MJBLACKEND

Reputation: 121

Try the following statement:

select  distinct A.[Tag],
     count(A.[Tag]) as TAG_COUNT,
     (SELECT count(*) FROM [TagTbl] AS B WHERE A.[Tag]=B.[Tag] AND B.[ID]>0)
     from [TagTbl] AS A GROUP BY A.[Tag]

The first field will be the tag the second will be the whole count the third will be the positive ones count.

Upvotes: 2

BrianC
BrianC

Reputation: 314

This may also work:

SELECT 
    COUNT(DISTINCT T.tag) as DistinctTag,
    COUNT(DISTINCT T2.tag) as DistinctPositiveTag
FROM Table T
    LEFT JOIN Table T2 ON T.tag = T2.tag AND T.entryID = T2.entryID AND T2.entryID > 0

You need the entryID condition in the left join rather than in a where clause in order to make sure that any items that only have a entryID of 0 get properly counted in the first DISTINCT.

Upvotes: -1

zaz
zaz

Reputation: 505

This may work:

SELECT Count(tag) AS 'Tag Count'
FROM Table
GROUP BY tag

and

SELECT Count(tag) AS 'Negative Tag Count'
FROM Table
WHERE entryID > 0
GROUP BY tag

Upvotes: 2

Related Questions