Reputation: 23
I have a Ceph cluster with 18 x 600 GB OSDs. There are three pools (size: 3, pg_num: 64), each holding a 200 GB image, and 6 servers are connected to these images via iSCSI and store about 20 VMs on them. Here is the output of "ceph df":
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL
cephfs_data 1 0 B 0 0 B 0 0 B
cephfs_metadata 2 17 KiB 22 1.5 MiB 100.00 0 B
defaults.rgw.buckets.data 3 0 B 0 0 B 0 0 B
defaults.rgw.buckets.index 4 0 B 0 0 B 0 0 B
.rgw.root 5 2.0 KiB 5 960 KiB 100.00 0 B
default.rgw.control 6 0 B 8 0 B 0 0 B
default.rgw.meta 7 393 B 2 384 KiB 100.00 0 B
default.rgw.log 8 0 B 207 0 B 0 0 B
rbd 9 150 GiB 38.46k 450 GiB 100.00 0 B
rbd3 13 270 GiB 69.24k 811 GiB 100.00 0 B
rbd2 14 150 GiB 38.52k 451 GiB 100.00 0 B
Based on this, I expect about 1.7 TiB of RAW capacity usage (150 + 270 + 150 GiB stored, times 3 replicas, is roughly 1.7 TiB), but it is currently about 9 TiB!
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 9.8 TiB 870 GiB 9.0 TiB 9.0 TiB 91.35
TOTAL 9.8 TiB 870 GiB 9.0 TiB 9.0 TiB 91.35
And the cluster is down because there is very little capacity remaining. I wonder what is causing this and how I can fix it. Your help is much appreciated.
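In case it helps with diagnosing this, the commands below are how the actual per-image allocation could be inspected (I have not pasted their output here); the pool names are the ones from my cluster:

# Show provisioned vs. actually used space for every image in a pool
rbd du -p rbd
rbd du -p rbd2
rbd du -p rbd3

# Per-pool accounting with more detail than plain "ceph df"
ceph df detail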
Upvotes: 0
Views: 506
Reputation: 23
The problem was that the iSCSI target was mounted without the discard option.
Since I am using Red Hat Virtualization, I just modified all storage domains created on top of Ceph and enabled "discard" on them. After just a few hours, about 1 TB of storage was released. Now, about 12 hours later, 5 TB of storage has been released. Thanks.
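For reference, on a plain Linux host that mounts the iSCSI LUN directly (rather than through RHV storage domains), the rough equivalent would be the commands below. This is only a sketch; the device path and mount point are placeholders, not my actual setup:

# Mount with online discard so the filesystem passes TRIM/UNMAP down to the LUN
mount -o discard /dev/mapper/mpatha /mnt/vmstore

# Or keep the normal mount options and trim periodically instead (usually lower overhead)
fstrim -v /mnt/vmstore

# Most systemd-based distributions also ship a weekly trim timer
systemctl enable --now fstrim.timer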
Upvotes: 1