Davood Ghatreh

Reputation: 23

Ceph OSDs are full, but I have not stored that much data

I have a Ceph cluster running with 18 x 600 GB OSDs. There are three pools (size: 3, pg_num: 64) with an image size of 200 GB on each, and six servers are connected to these images via iSCSI, storing about 20 VMs on them. Here is the output of "ceph df":

POOLS:

POOL                           ID     STORED      OBJECTS     USED        %USED      MAX AVAIL 
cephfs_data                     1         0 B           0         0 B          0           0 B 
cephfs_metadata                 2      17 KiB          22     1.5 MiB     100.00           0 B 
defaults.rgw.buckets.data       3         0 B           0         0 B          0           0 B 
defaults.rgw.buckets.index      4         0 B           0         0 B          0           0 B 
.rgw.root                       5     2.0 KiB           5     960 KiB     100.00           0 B 
default.rgw.control             6         0 B           8         0 B          0           0 B 
default.rgw.meta                7       393 B           2     384 KiB     100.00           0 B 
default.rgw.log                 8         0 B         207         0 B          0           0 B 
rbd                             9     150 GiB      38.46k     450 GiB     100.00           0 B 
rbd3                           13     270 GiB      69.24k     811 GiB     100.00           0 B 
rbd2                           14     150 GiB      38.52k     451 GiB     100.00           0 B 

Based on this, I expect about 1.7 TB of raw capacity usage, BUT it is currently about 9 TB!
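Working from the USED column above, that expectation is simply:

    450 GiB (rbd) + 811 GiB (rbd3) + 451 GiB (rbd2) ≈ 1.7 TiB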

RAW STORAGE:

CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
hdd       9.8 TiB     870 GiB     9.0 TiB      9.0 TiB         91.35 
TOTAL     9.8 TiB     870 GiB     9.0 TiB      9.0 TiB         91.35 

And the cluster is down because there is very little capacity remaining. I wonder what causes this and how I can fix it. Your help is much appreciated.

Upvotes: 0

Views: 506

Answers (1)

Davood Ghatreh

Reputation: 23

The problem was mounting the iSCSI targets without the discard option.

Since I am using Red Hat Virtualization, I just modified all storage domains created on top of Ceph and enabled "discard" on them. After a few hours, about 1 TB of storage was released. Now, about 12 hours later, 5 TB of storage has been released. Thanks
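For anyone hitting the same issue on a plain Linux iSCSI initiator instead of RHV, the equivalent (a minimal sketch; the device path, mount point, and image name below are hypothetical) is to mount with the discard option or run fstrim, and then watch the space come back on the Ceph side:

    # remount the iSCSI-backed filesystem with online discard
    mount -o discard /dev/sdb1 /mnt/vmstore

    # or trim already-freed blocks in one batch
    fstrim -v /mnt/vmstore

    # check reclamation from the Ceph side
    rbd du rbd/image1
    ceph df

Either way, the space only shows up as freed in "ceph df" once the discard/unmap requests actually reach the RBD images.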

Upvotes: 1
