Tony Vincent
Tony Vincent

Reputation: 14372

Strange behaviour when using FILTER clause

I have two queries

SELECT COUNT(DISTINCT keyword_id),
  date_trunc('month', rank_date) AS date
  FROM keyword_ranks, keywords
  WHERE keywords.deleted_at IS NULL
  AND keywords.id=keyword_ranks.keyword_id 
  AND keywords.business_id=27 GROUP BY date_trunc('month', rank_date);

The result-set is

enter image description here

Now I have added some FILTER clauses like

SELECT COUNT(DISTINCT keyword_id) FILTER (WHERE rank>50 OR rank is null) AS "50+",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 21 AND 50) AS "21-50",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 11 AND 20) AS "11-20",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 4 AND 10) AS "4-10",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 1 AND 3) AS "1-3",
  date_trunc('month', rank_date) AS date
  FROM keyword_ranks, keywords
  WHERE keywords.deleted_at IS NULL
  AND keywords.id=keyword_ranks.keyword_id 
  AND keywords.business_id=27 GROUP BY date_trunc('month', rank_date);

For which the result-set is

enter image description here

As you can see from the first query result, count of distinct keyword_id on every month is 147. But why is the count when using FILTER clauses add up more than 147 ? If this is not the way of obtaining distinct counts , how to go about this. Please bear with my newbie question. Any help much appreciated

Upvotes: 0

Views: 41

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125544

The ids are distinct within a rank but not within the total. Check this example:

with v (id, rank) as (values (1,1),(1,10))
select
    count(distinct id) as total,
    count(distinct id) filter (where rank < 10) as "< 10",
    count(distinct id) filter (where rank >= 10) as ">= 10"
from v
;
 total | < 10 | >= 10 
-------+------+-------
     1 |    1 |     1

The only way to make the sum of the rankings match the total is not to use distinct.

with v (id, rank) as (values (1,1),(1,10))
select
    count(id) as total,
    count(id) filter (where rank < 10) as "< 10",
    count(id) filter (where rank >= 10) as ">= 10"
from v
;
 total | < 10 | >= 10 
-------+------+-------
     2 |    1 |     1

Upvotes: 1

Related Questions