sinkyminky

Reputation: 83

ClickHouse: Should I optimize a MergeTree table manually?

I have a table like:

create table test (id String, timestamp DateTime, somestring String) ENGINE = MergeTree ORDER BY (id, timestamp)

I inserted 100 records, then inserted another 100 records. When I run select * from test, ClickHouse returns the data as 2 parts; each has 100 rows and is sorted within itself. Then I ran optimize table test, and the same select started returning 1 part with 200 rows, fully sorted. So should I run optimize after every insert, and does it improve the performance of select queries such as select count(*) from test where id = 'foo'?

Upvotes: 4

Views: 3081

Answers (2)

Denny Crane

Reputation: 13300

Merges are eventual and may never happen. Whether they happen depends on the number of inserts that came afterwards, the number of parts in the partition, and the size of the parts. If the total size of the input parts is greater than the maximum part size, they will never be merged.

It is very unreasonable to constantly merge down to one part. The merger has no such goal. On the contrary, the goal is to have the minimum number of parts within the smallest number of merges. Merges consume a huge amount of disk and processor resources.

It makes no sense to spend 3 hours merging two 300GB parts into one 600GB part. The merger has to read and decompress 600GB, merge, compress, and write it all back; after that, select performance will not improve at all, or will improve only minimally.
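To see where a table stands with respect to the part count and the size limit mentioned above, something like the following can be run. This is a rough sketch, assuming the table is named test as in the question and lives in the current database; the setting name max_bytes_to_merge_at_max_space_in_pool is my understanding of what "maximum part size" refers to, so check it against your server version.

```sql
-- Active parts of the table, grouped by partition.
SELECT
    partition,
    count() AS part_count,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS total_size
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'test'   -- table name taken from the question
  AND active
GROUP BY partition;

-- Assumed to be the limit on the total size of parts merged together.
SELECT name, value
FROM system.merge_tree_settings
WHERE name = 'max_bytes_to_merge_at_max_space_in_pool';
```

If part_count stays small and stable over time, the background merges are doing their job and a manual optimize is unlikely to buy anything.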

Upvotes: 2

Andrei Koch

Reputation: 1140

Usually not; you can rely on ClickHouse background merges.

Also, ClickHouse does not aim to merge all the data in a partition into a single part, because "over-optimization" can hurt performance too.
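If you want to confirm that background merges are running rather than take it on faith, a sketch like this may help. It again assumes the test table from the question in the current database. The OPTIMIZE statement at the end is shown only as a one-off option, not as something to schedule after every insert.

```sql
-- Merges currently in progress for the table.
-- An empty result just means nothing is being merged right now.
SELECT table, elapsed, progress, num_parts, result_part_name
FROM system.merges
WHERE database = currentDatabase()
  AND table = 'test';   -- table name taken from the question

-- One-off manual merge. FINAL forces a merge even when the data
-- is already in a single part, so it can be expensive on large tables.
OPTIMIZE TABLE test FINAL;
```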

Upvotes: 0
