J.A.R.V.I.S

Reputation: 23

How does the ClickHouse primary index work?

As the title says, I am confused about the ClickHouse primary index. It uses these files: primary.idx, [primaryField].mrk, and [primaryField].bin. Where is MarkRange stored, and how does it work? How do these files relate to each other?

Upvotes: 1

Views: 928

Answers (1)

Denny Crane

Reputation: 13300

primary.idx -- contains the values of the primary key columns, one entry per granule

Tested on ClickHouse 20.13.1.5273:
create table X(A Int64, S String) 
Engine=MergeTree order by A settings index_granularity=4096, min_bytes_for_wide_part=0;
insert into X select number, toString(number) from numbers(32768);

cd /var/lib/clickhouse/data/default/X/all_1_1_0/

order by A -- A is Int64 -- so primary.idx holds the values of A, 8 bytes each, one after another.
The granularity is set to 4096 in this table (index_granularity=4096), so one value is stored per 4096 rows:
primary.idx === 0, 4096, 8192, ...

check 

od -l -j 0 -N 8 primary.idx    #skip zero bytes read 8
0000000                    0

od -l -j 8 -N 8 primary.idx    #skip 8 bytes read 8
0000010                 4096

od -l -j 16 -N 8 primary.idx   #skip 16 bytes read 8
0000020                 8192      0
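
The same check can be sketched in Python. This is a minimal illustration, not ClickHouse's own code: it assumes a single Int64 key stored little-endian (as on x86), builds a synthetic stand-in for primary.idx instead of reading the real file, and the helper names read_int64_index and mark_for_key are hypothetical. It also shows the idea of how a key value is mapped to a mark number.

```python
import struct
from bisect import bisect_right

def read_int64_index(raw):
    """Decode raw primary.idx bytes for a key made of one Int64 column.

    Assumption: values are little-endian and packed back to back, 8 bytes each.
    """
    count = len(raw) // 8
    return list(struct.unpack(f"<{count}q", raw[:count * 8]))

def mark_for_key(index, key):
    """Return the number of the last mark whose first key is <= key."""
    return max(bisect_right(index, key) - 1, 0)

# Synthetic stand-in for the file: the first key of every 4096-row granule
# for the 32768 rows inserted above (not read from a real part directory).
raw = struct.pack("<8q", *range(0, 32768, 4096))

index = read_int64_index(raw)
print(index[:3])                    # [0, 4096, 8192]
print(mark_for_key(index, 12290))   # 3
```

To try it on a real part, replace the synthetic bytes with open("primary.idx", "rb").read().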


The mrk files (S.mrk2 here) contain, for each mark of the primary key, the offsets into the column's bin file

od -l -j 0 -N 24 S.mrk2
0000000                    0                    0
0000020                 4096

0  offset in compressed file (S.bin)
0  offset in decompressed block
4096 number of rows in the granule

od -l -j 48 -N 24 S.mrk2
0000060                    0                39850
0000100                 4096

0  offset in compressed file (S.bin)
39850  offset in decompressed block
4096 number of rows in the granule

od -l -j 72 -N 24 S.mrk2
0000110                    0                62618
0000130                 4096

0  offset in compressed file (S.bin)
62618  offset in decompressed block
4096 number of rows in the granule
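
The 24-byte records read with od above can also be decoded in one pass. The following is a rough sketch assuming the layout shown in this answer (three little-endian UInt64 values per mark); the Mark tuple and read_mrk2 are hypothetical names. The sample bytes are synthetic: marks 0, 2 and 3 use the values from the od output, while the value for mark 1 (19370) is not shown above and is computed for this sketch from the same string encoding, so treat it as an assumption.

```python
import struct
from collections import namedtuple

# One .mrk2 record as displayed above: two offsets plus a row count.
Mark = namedtuple("Mark", "compressed_offset decompressed_offset rows")

def read_mrk2(raw):
    """Split raw .mrk2 bytes into 24-byte records of three UInt64 values."""
    record = struct.Struct("<3Q")
    usable = len(raw) - len(raw) % record.size
    return [Mark(*fields) for fields in record.iter_unpack(raw[:usable])]

# Synthetic stand-in for the first four records of S.mrk2.
sample = b"".join(
    struct.pack("<3Q", *fields)
    for fields in [
        (0, 0, 4096),
        (0, 19370, 4096),   # assumption: computed, not shown in the od output
        (0, 39850, 4096),
        (0, 62618, 4096),
    ]
)

for number, mark in enumerate(read_mrk2(sample)):
    print(number, mark)
```

Record n starts at byte n * 24, which is why od -j 48 shows mark 2 and od -j 72 shows mark 3.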


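Each record in the mrk2 file is 24 bytes, so the record at byte 72 is mark 3 (counting from 0), and three granules precede it:
4096+4096+4096 = 12288 -- the granule of the S column at this mark (the fourth one) must start with the string 12288

Check: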
dd status=none bs=1 skip=0 if=S.bin|clickhouse-compressor -d|dd status=none bs=1 skip=62618 count=50|hexdump -C
00000000  05 31 32 32 38 38 05 31  32 32 38 39 05 31 32 32  |.12288.12289.122|
00000010  39 30 05 31 32 32 39 31  05 31 32 32 39 32 05 31  |90.12291.12292.1|
00000020  32 32 39 33 05 31 32 32  39 34 05 31 32 32 39 35  |2293.12294.12295|
00000030  05 31                                             |.1|
00000032
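
The offsets 39850 and 62618 can be reproduced by hand. This sketch assumes what the hexdump suggests: each String value is stored as a one-byte length prefix followed by its characters (which holds for these short strings). The function names are hypothetical.

```python
def encoded_size(value):
    """Bytes used by one value: 1 length byte plus the decimal digits."""
    return 1 + len(str(value))

def offset_of_row(row):
    """Byte offset of a row in the decompressed column, counting from row 0."""
    return sum(encoded_size(n) for n in range(row))

print(offset_of_row(8192))    # 39850, the decompressed offset of mark 2
print(offset_of_row(12288))   # 62618, the decompressed offset of mark 3

# Rebuild the decompressed column in memory and look at the same position
# that the dd | hexdump pipeline above inspects.
column = b"".join(bytes([len(str(n))]) + str(n).encode() for n in range(32768))
print(column[62618:62624])    # b'\x0512288'
```

The sums match the values stored in S.mrk2, which is consistent with the mark pointing at the first row of its granule.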

Some diagrams (in Russian): https://raw.githubusercontent.com/clickhouse/clickhouse-presentations/master/meetup27/adaptive_index_granularity.pdf

Upvotes: 2
