Donatas Veikutis
Donatas Veikutis

Reputation: 1004

Clickhouse Null table engine not incrementing AggregatingMergeTree correctly on cluster

ClickHouse version I using: 22.11-alpine

I use Null table engine in my project as temp table from which data will be sent to ReplicatedAggregatingMergeTree.

This described test case indicates problem:

CREATE TABLE default.data ON CLUSTER '{cluster}'
(
    id Int8,
    group_by String,
    value Int8
)
ENGINE = Null;

CREATE TABLE default.data_agg ON CLUSTER '{cluster}'
(
    group_by String,
    value AggregateFunction(sum, Int8)
)
ENGINE = ReplicatedAggregatingMergeTree
ORDER BY group_by;

CREATE MATERIALIZED VIEW default.data_agg_mv ON CLUSTER '{cluster}'
TO default.data_agg AS
SELECT
    group_by,
    sumState(value) as value
FROM default.data
GROUP BY group_by;

CREATE TABLE default.data_agg_dis ON CLUSTER '{cluster}'
AS default.data_agg
ENGINE = Distributed('{cluster}', default, data_agg, rand());

All that schemas are created on the cluster (ClickHouse Keeper), with 4 nodes, 4 shards, and 2 replicas.

Problem:

INSERT INTO default.data VALUES (1, 'group1', 1);

Above insert query at first-time increments value in data_agg_dis correctly:

SELECT group_by, sumMerge(value) as value FROM default.data_agg_dis GROUP BY group_by;

Results:
group_by | value
group1   |   1

But when I try to repeat mentioned insert value for group_by: group1, value reaches just up to value: 4. Feels like that when all shards touch all other aggregates are skipped.

But if I change group_by: group1 value in insert query from 1 to 2:

    INSERT INTO default.data VALUES (1, 'group1', 2);

again in data_agg_dis value increments by 2, four times and stops. This behavior is reproducible just on cluster on a single node ClickHouse server all works as espected.

Does anybody aware of same issue? it is some limitation of ClickHouse cluster? or bug?

Upvotes: 0

Views: 475

Answers (1)

Slach
Slach

Reputation: 2450

Your ReplicatedAggregatingMergeTree just apply deduplication by block cause you try to INSERT the same data.

Look to https://clickhouse.com/docs/en/operations/settings/settings/#settings-insert-deduplicate for details.

Upvotes: 2

Related Questions