Andrew
Andrew

Reputation: 23

include "0" count result in clickhouse

I have one t_table(some_date, id), and i want get count of ids for each day between two input dates.

I try this query:

SELECT dateDiff('day', toDateTime('2020-07-01 07:34:22'), some_date) as day,
       count(distinct id) as ids,
from t_table
where (some_date between '2020-07-01 00:09:25' and '2020-07-15 21:09:25') group by day order by day;

This query returns day only if id has (some_date) for this day. I want get column of all days between input dates, and if no id in some day in ids column put 0 to that cell.

I think that needed to use some join, but i don't know what join with.

Upvotes: 1

Views: 611

Answers (3)

Denny Crane
Denny Crane

Reputation: 13310

order by ... with fill

SELECT dateDiff('day', toDateTime('2020-07-01 07:34:22'), some_date) as day,
       count() as ids
from (select toDateTime(arrayJoin(['2020-07-02 00:09:25','2020-07-02 00:10:25','2020-07-15 00:00:25'])) some_date)
where (some_date between '2020-07-01 00:09:25' and '2020-07-15 21:09:25') group by day order by day
┌─day─┬─ids─┐
│   1 │   2 │
│  14 │   1 │
└─────┴─────┘


SELECT dateDiff('day', toDateTime('2020-07-01 07:34:22'), some_date) as day,
       count() as ids
from (select toDateTime(arrayJoin(['2020-07-02 00:09:25','2020-07-02 00:10:25','2020-07-15 00:00:25'])) some_date)
where (some_date between '2020-07-01 00:09:25' and '2020-07-15 21:09:25') group by day order by day
with fill;
┌─day─┬─ids─┐
│   1 │   2 │
│   2 │   0 │
│   3 │   0 │
│   4 │   0 │
│   5 │   0 │
│   6 │   0 │
│   7 │   0 │
│   8 │   0 │
│   9 │   0 │
│  10 │   0 │
│  11 │   0 │
│  12 │   0 │
│  13 │   0 │
│  14 │   1 │
└─────┴─────┘

Upvotes: 4

Gordon Linoff
Gordon Linoff

Reputation: 1269973

You can use the numbers function to generate numbers and then convert them to dates. If I follow the logic, the rest of the query follows as a left join with aggregation:

with dates as (
      select toDateTime('2020-07-01 00:09:25') + n as dt, number as n
      from numbers(1, 15)
     )
select d.n as day, count(distinct t.id) as ids,
from dates d left join
     t_table t
     on t.some_date >= d.dt and
        t.some_date < d.dt + 1
group by day
order by day;

I'm not 100% sure if Clickhouse accepts those conditions on a LEFT JOIN. You might have to use something like this:

on toDate(some_date) = to_date(d.dt)

Upvotes: 0

vladimir
vladimir

Reputation: 15216

Try this query:

SELECT
    greatest(day, date_range.day) AS day,
    ids
FROM
(
    SELECT *
    FROM
    (
        SELECT
            dateDiff('day', toDateTime('2020-07-01 07:34:22'), some_date) AS day,
            countDistinct(id) AS ids
        FROM t_table
        WHERE (some_date >= '2020-07-01 00:09:25') AND (some_date <= '2020-07-15 21:09:25')
        GROUP BY day
    ) AS result
    FULL OUTER JOIN
    (
        SELECT toDate('2020-07-01 00:09:25') + number AS day
        FROM numbers(0, 15)
    ) AS date_range ON result.day = date_range.day
)
ORDER BY date ASC

Upvotes: 1

Related Questions