Scott McKeown

Reputation: 141

Ceph storage OSD disk upgrade (replace with larger drive)

I have three servers, each with 1 x SSD (Ceph base OS) and 6 x 300GB SAS drives; at the moment I'm only using 4 drives on each server as the OSDs in my Ceph storage array and everything is fine. My question is: now that I have built this and got everything up and running, if in say 6 months or so I need to replace these OSDs because the storage array is running out of space, is it possible to remove one disk at a time from each server and replace it with a larger drive?

For example, if server 1 has OSDs 0-5, server 2 has OSDs 6-11 and server 3 has OSDs 12-17, could I one day remove OSD 0 and replace it with a 600GB SAS drive, wait for the cluster to heal, then do the same with OSD 6, then OSD 12, and so on until all the disks are replaced? Would this then give me a larger storage pool?

Upvotes: 2

Views: 3067

Answers (2)

ImmuC

Reputation: 1

If you have a lot of disks in your servers and you want to upgrade all of them, I believe it is also possible to drain one host at a time (there is a rough script version of this after the list):

  1. Select a host and drain it (GUI --> Cluster --> Hosts --> Select Host --> Start Drain)
  2. Wait for the drain to finish
  3. Shut down the host (or not, if the disks are hot-pluggable)
  4. Replace all of the host's disks with the bigger ones.
  5. Remove the _no_schedule label from the host and let Ceph recreate the services
  6. Let Ceph recreate the OSDs (or create them yourself if necessary)
  7. Wait for the cluster to be in a healthy state again.
  8. Repeat with the other hosts.
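
Roughly in script form, that workflow looks like the sketch below. It is only a sketch: it assumes a cephadm-managed cluster where the ceph CLI and admin keyring are available on the node running it, ceph-node-1 is an example hostname, and the "drain finished" check is left to the operator rather than parsing the status output (its exact format varies by release).

    #!/usr/bin/env python3
    # Rough sketch of the host-drain workflow above. Assumptions (not from the
    # original post): a cephadm-managed cluster, the 'ceph' CLI and admin
    # keyring on the node running this, and 'ceph-node-1' as an example host.
    import subprocess
    import time

    HOST = "ceph-node-1"  # example value

    def ceph(*args):
        """Run a ceph CLI command and return its stdout."""
        return subprocess.run(["ceph", *args], check=True,
                              capture_output=True, text=True).stdout.strip()

    # Steps 1-2: drain the host (this also applies the _no_schedule label),
    # then watch the removal queue until nothing is pending.
    ceph("orch", "host", "drain", HOST)
    while True:
        print(ceph("orch", "osd", "rm", "status"))
        if input("Drain finished? [y/N] ").strip().lower() == "y":
            break
        time.sleep(60)

    # Steps 3-4: power the host down if needed and swap the disks by hand.
    input(f"Replace the disks in {HOST}, boot it back up, then press Enter... ")

    # Steps 5-6: remove the _no_schedule label so cephadm schedules services
    # and OSDs on the host again (create the OSDs yourself if you need to).
    ceph("orch", "host", "label", "rm", HOST, "_no_schedule")

    # Steps 7-8: wait for recovery to finish, then repeat on the next host.
    while "HEALTH_OK" not in ceph("health"):
        time.sleep(60)
    print(ceph("osd", "tree"))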

Upvotes: 0

Scott McKeown

Reputation: 141

OK, just for anyone who is looking for this answer in the future: you can upgrade your drives in the way I mention above. Here are the steps I took (please note that this was in a lab, not production); there is a rough script version after the list.

  1. Mark the OSD as down
  2. Mark the OSD as out
  3. Remove the drive in question
  4. Install the new drive (it must be either the same size or larger)
  5. I needed to reboot the server in question for the new disk to be seen by the OS
  6. Add the new disk into Ceph as normal
  7. Wait for the cluster to heal, then repeat on a different server
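
In script form, those steps look roughly like this. It is only a sketch: it assumes it runs on the OSD's own node, that the node has the admin keyring, that the OSD uses a classic ceph-osd systemd unit rather than a containerised one, and the OSD id and device path are example values, not taken from my setup.

    #!/usr/bin/env python3
    # Sketch of the per-OSD steps above. Assumptions (not from the original
    # post): runs on the OSD's own node, admin keyring present, classic
    # ceph-osd systemd unit, and the id/device below are example values.
    import subprocess
    import time

    OSD_ID = "0"              # example: replacing osd.0 on this node
    NEW_DEVICE = "/dev/sdb"   # example: the freshly installed, larger disk

    def run(*cmd):
        """Run a command and return its stdout."""
        return subprocess.run(cmd, check=True,
                              capture_output=True, text=True).stdout

    def wait_for_health_ok(poll_seconds=60):
        """Poll until the cluster reports HEALTH_OK again."""
        while "HEALTH_OK" not in run("ceph", "health"):
            time.sleep(poll_seconds)

    # Steps 1-2: take the OSD out of data placement and stop its daemon
    # (stopping the daemon is what marks it down), then let the cluster
    # re-replicate the data it held.
    run("ceph", "osd", "out", OSD_ID)
    run("systemctl", "stop", f"ceph-osd@{OSD_ID}")
    time.sleep(60)  # give peering/backfill a moment to start
    wait_for_health_ok()

    # Step 3: check it is safe, then remove the old OSD from the CRUSH map,
    # auth database and OSD map in one go, and pull the physical drive.
    run("ceph", "osd", "safe-to-destroy", OSD_ID)
    run("ceph", "osd", "purge", OSD_ID, "--yes-i-really-mean-it")

    # Steps 4-5: swap the drive by hand (reboot first if the OS can't see it,
    # then carry on from here).
    input(f"Install the new drive at {NEW_DEVICE}, then press Enter... ")

    # Step 6: create the replacement OSD on the new disk.
    run("ceph-volume", "lvm", "create", "--data", NEW_DEVICE)

    # Step 7: wait for backfill to finish before moving to the next server.
    wait_for_health_ok()
    print(run("ceph", "osd", "df"))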

I have now done this with 6 out of my 15 drives across 3 servers, and each time the size of the Ceph storage has increased a little (I'm only going from 320GB drives to 400GB drives as this is only a test and I have some of those drives sitting unused).

I plan on starting this on the live production servers next week now that I know it works, and going from 300GB to 600GB drives I should see a larger increase in storage (I hope).
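
For checking the capacity gain after each round of swaps, a small script along these lines should do. Note that the JSON field names are an assumption based on recent Ceph releases, so compare them with the output of ceph df --format json-pretty on your version if they differ.

    #!/usr/bin/env python3
    # Quick look at the raw capacity the cluster reports, plus per-OSD sizes
    # so you can see which drives have already been swapped. The JSON field
    # names are assumptions based on recent Ceph releases.
    import json
    import subprocess

    def ceph_json(*args):
        """Run a ceph CLI command with JSON output and parse it."""
        out = subprocess.run(["ceph", *args, "--format", "json"],
                             check=True, capture_output=True, text=True).stdout
        return json.loads(out)

    stats = ceph_json("df")["stats"]
    print("raw capacity: {:.2f} TiB, available: {:.2f} TiB".format(
        stats["total_bytes"] / 1024 ** 4,
        stats["total_avail_bytes"] / 1024 ** 4))

    # Per-OSD sizes and utilisation.
    for osd in ceph_json("osd", "df")["nodes"]:
        print("osd.{:<3} {:8.1f} GiB  {:5.1f}% used".format(
            osd["id"], osd["kb"] / 1024 ** 2, osd["utilization"]))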

Upvotes: 2
