Reputation: 1004
ClickHouse version I am using: 22.11-alpine
I use the Null table engine in my project as a temporary table from which data is forwarded to a ReplicatedAggregatingMergeTree.
The following test case demonstrates the problem:
CREATE TABLE default.data ON CLUSTER '{cluster}'
(
id Int8,
group_by String,
value Int8
)
ENGINE = Null;
CREATE TABLE default.data_agg ON CLUSTER '{cluster}'
(
group_by String,
value AggregateFunction(sum, Int8)
)
ENGINE = ReplicatedAggregatingMergeTree
ORDER BY group_by;
CREATE MATERIALIZED VIEW default.data_agg_mv ON CLUSTER '{cluster}'
TO default.data_agg AS
SELECT
group_by,
sumState(value) as value
FROM default.data
GROUP BY group_by;
CREATE TABLE default.data_agg_dis ON CLUSTER '{cluster}'
AS default.data_agg
ENGINE = Distributed('{cluster}', default, data_agg, rand());
All these schemas are created on a cluster (using ClickHouse Keeper) with 4 nodes, 4 shards, and 2 replicas.
Problem:
INSERT INTO default.data VALUES (1, 'group1', 1);
The first time, the insert above increments the value in data_agg_dis correctly:
SELECT group_by, sumMerge(value) as value FROM default.data_agg_dis GROUP BY group_by;
Results:
group_by | value
group1   | 1
But when I repeat the same insert for group_by = 'group1', the value only climbs to 4 and then stops. It feels like once the insert has touched every shard, all further aggregation is skipped.
But if I change the inserted value for group_by = 'group1' from 1 to 2:
INSERT INTO default.data VALUES (1, 'group1', 2);
the value in data_agg_dis again increments by 2, exactly four times, and then stops. This behavior is reproducible only on the cluster; on a single-node ClickHouse server everything works as expected.
Is anybody aware of this issue? Is it a limitation of ClickHouse clusters, or a bug?
Upvotes: 0
Views: 475
Reputation: 2450
Your ReplicatedAggregatingMergeTree simply applies block-level deduplication because you INSERT the exact same data: a replicated table keeps checksums of recently inserted blocks and silently skips any block it has already seen. See https://clickhouse.com/docs/en/operations/settings/settings/#settings-insert-deduplicate for details.
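If you need to re-insert identical rows (for example in tests), you can disable this deduplication with the insert_deduplicate setting. A minimal sketch against the schema from the question:

-- Disable block deduplication for this INSERT only,
-- so the replicated target accepts the identical block again.
INSERT INTO default.data SETTINGS insert_deduplicate = 0 VALUES (1, 'group1', 1);

-- Or disable it for the whole session:
SET insert_deduplicate = 0;
INSERT INTO default.data VALUES (1, 'group1', 1);

Alternatively, leave deduplication on and make each inserted block genuinely distinct (real production data usually is), e.g. by varying the id column; only byte-identical blocks are dropped.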
Upvotes: 2