Hibari

Reputation: 131

Redshift Cluster Resize [Insufficient Disk Space]

We currently have a Redshift cluster running on ds2.xlarge and would like to resize it to dc2.large, but the resize fails with an error saying that disk space is insufficient.

We currently have about 720 GB of data, and we are trying to resize to a dc2.large cluster with 5 nodes (800 GB total).

I'm not sure why we are getting this error. Do you have any ideas?

Upvotes: 0

Views: 2927

Answers (1)

Bill Weiner

Reputation: 11032

Hibari - Packing 720GB of data into a Redshift cluster with max storage of 800GB is not advisable. This is for several reasons:

  1. Disk space is needed as scratch space for performing queries and other data operations (vacuum)
  2. Scratch data is not compressed when it is stored on disk
  3. Data load operations, such as COPY, need space to store incoming data
  4. Redshift is based on multi-version concurrency control, so many additional blocks need to be kept so that transactions have access to the correct data
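To put numbers on the points above, here is a minimal sketch of the arithmetic using the figures from the question (720 GB of data, 5 nodes, 800 GB total, so 160 GB per node). The function name is just an illustration, not anything Redshift provides.

```python
def disk_utilization(data_gb: float, nodes: int, gb_per_node: float) -> float:
    """Return the fraction of raw cluster storage occupied by stored data."""
    return data_gb / (nodes * gb_per_node)


# Figures from the question: 720 GB of data on 5 nodes of 160 GB each.
used = disk_utilization(720, 5, 160)
free_gb = 5 * 160 - 720

print(f"utilization: {used:.0%}, free: {free_gb} GB")
```

That leaves only about 80 GB across the whole cluster for scratch space, incoming loads, and extra block versions, before any of the resize's own working data is counted.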

Your transition between node types will change how data is stored on disk - you are moving from 1 (?) node with 16 slices to 5 nodes with 10 slices. If you have distribution ALL tables, a full copy is stored on every node, so these will be stored 5 times, not once. The size of the database on the new cluster may not be the same as on the old one - it could be smaller or larger.
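As a rough illustration of the distribution ALL effect, the sketch below estimates the stored size before and after the resize. The 700 GB / 20 GB split is purely hypothetical (the question does not say how much data is in ALL tables), the single source node is the guess made above, and the estimate ignores compression and block-level changes.

```python
def footprint_after_resize(distributed_gb, all_table_gb, old_nodes, new_nodes):
    """Estimate stored size when DISTSTYLE ALL tables are copied to every node.

    distributed_gb: size of KEY/EVEN tables (stored once across the cluster)
    all_table_gb:   size of ONE copy of the DISTSTYLE ALL tables
    """
    before = distributed_gb + all_table_gb * old_nodes
    after = distributed_gb + all_table_gb * new_nodes
    return before, after


# Hypothetical split of the 720 GB: 700 GB distributed, 20 GB in ALL tables.
before, after = footprint_after_resize(700, 20, old_nodes=1, new_nodes=5)
print(f"before: {before} GB, after: {after} GB")
```

Under these assumed numbers, even a small amount of ALL-distributed data is enough to consume the remaining headroom on the 5-node cluster.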

I expect that the real limit is in organizing the data on the new cluster as it comes in from the old cluster. As I mentioned, data in flight is uncompressed, and it is likely this working set that is causing the issue.

You need a bigger cluster for this much data.

Upvotes: 1
