
Why Is the Maximum Metadata Size of a Thin Logical Volume Only 15.81 GiB?


My boss asked me to create a thin pool with a metadata size exceeding the maximum limit.

As a first step, I ran the following commands:

[root@localhost ~]# pvs /dev/vdb
  Configuration setting "devices/allow_mixed_block_sizes" unknown.
  PV         VG Fmt  Attr PSize  PFree
  /dev/vdb   nm lvm2 a--  <1.82t <1.82t
[root@localhost ~]# lvcreate -Zn --errorwhenfull y -l 100%FREE --poolmetadatasize 20g --chunksize 1m --thinpool pool0 nm
  Configuration setting "devices/allow_mixed_block_sizes" unknown.
  Thin pool volume with chunk size 1.00 MiB can address at most 253.00 TiB of data.
  WARNING: Maximum supported pool metadata size is 15.81 GiB.
  Logical volume "pool0" created.
[root@localhost ~]#
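
(As an aside on the "can address at most 253.00 TiB" line: that figure looks like it follows from a rule of thumb of roughly 64 bytes of pool metadata per chunk mapping, i.e. addressable data ≈ metadata size / 64 × chunk size. The snippet below is only my own back-of-the-envelope check of that relationship; the 64-byte-per-mapping constant is my assumption, not something taken from lvm2 code.)

#include <stdio.h>
#include <stdint.h>

/* Back-of-the-envelope check of the "at most 253.00 TiB" warning.
 * Assumption: roughly 64 bytes of pool metadata per chunk mapping. */
int main(void)
{
    const uint64_t GiB = 1024ULL * 1024 * 1024;
    const uint64_t TiB = 1024ULL * GiB;

    uint64_t metadata_bytes = (uint64_t)(15.81 * GiB); /* reported metadata maximum */
    uint64_t chunk_bytes    = 1024ULL * 1024;          /* --chunksize 1m            */
    uint64_t bytes_per_map  = 64;                      /* assumed cost per mapping  */

    uint64_t mappings = metadata_bytes / bytes_per_map;
    double max_data   = (double)mappings * chunk_bytes / TiB;

    printf("~%.2f TiB addressable with 1 MiB chunks\n", max_data); /* ~252.96 */
    return 0;
}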

The warning above is printed by LVM2, so I bypassed the metadata size limit in lvm2 by commenting out the clamp in lib/metadata/thin_manip.c:

if (pool_metadata_size > (2 * DEFAULT_THIN_POOL_MAX_METADATA_SIZE)) {
    log_warn("WARNING: boss is a fool.");
    // pool_metadata_size = 2 * DEFAULT_THIN_POOL_MAX_METADATA_SIZE;
    // if (*pool_metadata_extents)
    //      log_warn("WARNING: Maximum supported pool metadata size is %s.",
    //               display_size(cmd, pool_metadata_size));
}

But then I found that lvm2's metadata size limit comes from a corresponding limit in the kernel.

The following are the relevant restrictions in the kernel (drivers/md/persistent-data/dm-space-map-metadata.h):

     /*
      * The metadata device is currently limited in size.
      *
      * We have one block of index, which can hold 255 index entries.  Each
      * index entry contains allocation info about ~16k metadata blocks.
      */
     #define DM_SM_METADATA_MAX_BLOCKS (255 * ((1 << 14) - 64))
     #define DM_SM_METADATA_MAX_SECTORS (DM_SM_METADATA_MAX_BLOCKS * DM_SM_METADATA_BLOCK_SIZE)
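
Plugging the numbers in: 255 * ((1 << 14) - 64) = 4,161,600 metadata blocks of 4 KiB each (assuming DM_SM_METADATA_BLOCK_SIZE is 4096 >> SECTOR_SHIFT, i.e. 8 sectors, as in the tree I am reading), which comes to roughly 15.88 GiB, slightly above the 15.81 GiB that lvm2 prints, so lvm2 apparently trims it a little further. A quick stand-alone sanity check of that arithmetic (not kernel code, block size assumed):

#include <stdio.h>
#include <stdint.h>

/* Reproduce the arithmetic from dm-space-map-metadata.h.
 * Assumption: DM_SM_METADATA_BLOCK_SIZE is 4096 >> 9 = 8 sectors (4 KiB blocks). */
#define DM_SM_METADATA_BLOCK_SIZE  (4096 >> 9)                /* sectors per metadata block */
#define DM_SM_METADATA_MAX_BLOCKS  (255 * ((1 << 14) - 64))
#define DM_SM_METADATA_MAX_SECTORS ((uint64_t)DM_SM_METADATA_MAX_BLOCKS * DM_SM_METADATA_BLOCK_SIZE)

int main(void)
{
    uint64_t bytes = DM_SM_METADATA_MAX_SECTORS * 512;
    printf("max metadata blocks: %d\n", DM_SM_METADATA_MAX_BLOCKS);   /* 4161600 */
    printf("max metadata size  : %llu bytes (~%.2f GiB)\n",
           (unsigned long long)bytes,
           (double)bytes / (1024.0 * 1024 * 1024));                   /* ~15.88  */
    return 0;
}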

So I modified the macro definition of DM_SM_METADATA_MAX_BLOCKS:

#define DM_SM_METADATA_MAX_BLOCKS (2 * 255 * ((1 << 14) - 64))

But I got an error when creating the pool:

[root@localhost md]# lvcreate -Zn --errorwhenfull y -l 100%FREE --poolmetadatasize 20g --chunksize 1m --thinpool pool0 nm
  Configuration setting "allocation/thin_pool_crop_metadata" unknown.
  Configuration setting "devices/allow_mixed_block_sizes" unknown.
  Thin pool volume with chunk size 1.00 MiB can address at most 253.00 TiB of data.
  WARNING: boss is a fool.
  device-mapper: reload ioctl on  (253:4) failed: Invalid argument
  Failed to activate new LV.
  Internal error: Removing still active LV nm/pool0_tmeta.
  Manual intervention may be required to remove abandoned LV(s) before retrying.
Removal of pool metadata spare logical volume nm/lvol0_pmspare disables automatic recovery attempts after damage to a thin or cache pool. Proceed? [y/n]: n
  Logical volume nm/lvol0_pmspare not removed.
[root@localhost md]# lvs
  Configuration setting "allocation/thin_pool_crop_metadata" unknown.
  Configuration setting "devices/allow_mixed_block_sizes" unknown.
  LV    VG Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home  cs -wi-ao---- <52.06g
  root  cs -wi-ao----  70.00g
  swap  cs -wi-ao----  <4.94g
  pool0 nm twi---t---   1.78t
[root@localhost md]# lvs -a
  Configuration setting "allocation/thin_pool_crop_metadata" unknown.
  Configuration setting "devices/allow_mixed_block_sizes" unknown.
  LV              VG Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home            cs -wi-ao---- <52.06g
  root            cs -wi-ao----  70.00g
  swap            cs -wi-ao----  <4.94g
  [lvol0_pmspare] nm ewi-------  20.00g
  pool0           nm twi---t---   1.78t
  [pool0_tdata]   nm Twi-a-----   1.78t
  [pool0_tmeta]   nm ewi-a-----  20.00g
[root@localhost md]# lsblk /dev/
lsblk: /dev/: not a block device
[root@localhost md]# lsblk
NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0               11:0    1 1024M  0 rom
vda              252:0    0  128G  0 disk
├─vda1           252:1    0    1G  0 part /boot
└─vda2           252:2    0  127G  0 part
  ├─cs-root      253:0    0   70G  0 lvm  /
  ├─cs-swap      253:1    0    5G  0 lvm  [SWAP]
  └─cs-home      253:5    0 52.1G  0 lvm  /home
vdb              252:16   0  1.8T  0 disk
├─nm-pool0_tmeta 253:2    0 15.8G  0 lvm
└─nm-pool0_tdata 253:3    0  1.8T  0 lvm

The following is the relevant kernel log:

Jul 20 22:40:49 localhost kernel: dm_persistent_data: loading out-of-tree module taints kernel.
Jul 20 22:40:49 localhost kernel: dm_persistent_data: module verification failed: signature and/or required key missing - tainting kernel
Jul 20 22:41:11 localhost kernel: device-mapper: space map common: space map too large
Jul 20 22:41:11 localhost kernel: device-mapper: transaction manager: couldn't create metadata space map
Jul 20 22:41:11 localhost kernel: device-mapper: thin metadata: tm_create_with_sm failed
Jul 20 22:41:11 localhost kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object
Jul 20 22:41:11 localhost kernel: device-mapper: ioctl: error adding target to table
Jul 20 22:42:38 localhost kernel: device-mapper: space map common: space map too large
Jul 20 22:42:38 localhost kernel: device-mapper: transaction manager: couldn't create metadata space map
Jul 20 22:42:38 localhost kernel: device-mapper: thin metadata: tm_create_with_sm failed
Jul 20 22:42:38 localhost kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object
Jul 20 22:42:38 localhost kernel: device-mapper: ioctl: error adding target to table
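
My reading of the "space map common: space map too large" message is that the metadata space map has a single index block that can hold only 255 index entries (the same 255 from the header comment quoted above), so increasing DM_SM_METADATA_MAX_BLOCKS alone does not change what the on-disk format can describe, and the larger metadata device is rejected when the target is loaded. The sketch below only mirrors that arithmetic as I understand it; the constants are taken from the header comment, and the check is an illustration, not the kernel's actual code.

#include <stdio.h>
#include <stdint.h>

/* Illustration of the limit that (as I understand it) triggers
 * "space map too large": one index block holds at most 255 entries,
 * each covering about (1 << 14) - 64 metadata blocks of 4 KiB.
 * Constants taken from the header comment above; this is not the
 * kernel's actual check. */
#define MAX_INDEX_ENTRIES   255
#define BLOCKS_PER_ENTRY    ((1 << 14) - 64)
#define METADATA_BLOCK_SIZE 4096ULL

static int metadata_fits(uint64_t metadata_bytes)
{
    uint64_t nr_blocks  = metadata_bytes / METADATA_BLOCK_SIZE;
    uint64_t nr_entries = (nr_blocks + BLOCKS_PER_ENTRY - 1) / BLOCKS_PER_ENTRY;
    return nr_entries <= MAX_INDEX_ENTRIES;
}

int main(void)
{
    const uint64_t GiB = 1024ULL * 1024 * 1024;
    printf("15.8 GiB metadata fits: %d\n", metadata_fits((uint64_t)(15.8 * GiB))); /* 1 */
    printf("20.0 GiB metadata fits: %d\n", metadata_fits(20 * GiB));               /* 0 */
    return 0;
}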

So I would like to know the following:

  1. How is the 15.81 GiB figure calculated? Is it based on the upper limit on btree nodes?
  2. What would it cost to push past this maximum? Is it a good idea to break it at all?
