isaac.hazan

Reputation: 3874

Redshift COPY automatic compression

I am unclear on how the automatic compression works when using the COPY command with Redshift.

The documentation says:

By default, the COPY command applies automatic compression whenever you run the COPY command with an empty target table and all of the table columns either have RAW encoding or no encoding.

Does this mean that for my main table, where the raw data is copied on an ongoing basis, the data will be compressed only the first time a COPY runs against the table and never again on subsequent loads? It seems like I am misunderstanding something, because it would not make sense for it to work this way.

Thx

Upvotes: 1

Views: 781

Answers (2)

pcothenet

Reputation: 401

I can confirm Masashi's answer. Note, however, that the documentation also says:

Automatic compression analysis requires enough rows in the load data (at least 100,000 rows per slice) to allow sampling to take place.

If your first COPY into the table is a small batch, no compression analysis takes place and the table is left with no encoding. Subsequent COPY calls won't change that, because the table is no longer empty. You can fix it later by running a deep copy of your table.
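
For what it's worth, a deep copy along those lines could look roughly like the sketch below. The table name `events`, its columns, and the encodings in the new table are all made up for illustration; in practice you would take the encodings from the `ANALYZE COMPRESSION` report for your own data. Note that `ANALYZE COMPRESSION` takes an exclusive lock on the table while it runs.

```sql
-- Ask Redshift which encodings it would suggest for the data already loaded.
ANALYZE COMPRESSION events;

-- Create a new table with explicit encodings.
-- Columns and encodings here are placeholders, not recommendations.
CREATE TABLE events_new (
    event_id   BIGINT        ENCODE az64,
    event_time TIMESTAMP     ENCODE az64,
    payload    VARCHAR(1024) ENCODE zstd
);

-- Copy the rows across; they are compressed with the new encodings on write.
INSERT INTO events_new SELECT * FROM events;

-- Swap the tables and drop the old, unencoded one.
ALTER TABLE events RENAME TO events_old;
ALTER TABLE events_new RENAME TO events;
DROP TABLE events_old;
```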

Upvotes: 0

Masashi M

Reputation: 2757

Basically, an encoding (compression) type needs to be set for each column when the table is created. There is one exception, though, as you quoted from the AWS docs: when data is copied into an empty table, Redshift automatically analyzes the incoming data and sets the best encoding for every column as part of the load. Subsequent data is then compressed with those encodings.

Therefore, the answer to your question is "No". Once an encoding (compression) has been set, either way, all subsequently loaded rows are compressed with it.
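
To make that concrete, here is a rough sketch of the sequence. The table name `events`, the S3 paths, and the IAM role ARN are invented for the example; `COMPUPDATE ON` only spells out what the docs describe as the default for an empty table.

```sql
-- Hypothetical table; every column starts out as RAW (uncompressed).
CREATE TABLE events (
    event_id   BIGINT        ENCODE RAW,
    event_time TIMESTAMP     ENCODE RAW,
    payload    VARCHAR(1024) ENCODE RAW
);

-- First load into the empty table: COPY samples the data and picks encodings.
-- Bucket and role are placeholders.
COPY events
FROM 's3://my-bucket/events/batch-0001/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
COMPUPDATE ON;

-- Check which encodings were chosen.
SELECT "column", type, encoding
FROM pg_table_def
WHERE tablename = 'events';

-- Later loads: the table is no longer empty, so no new analysis is done,
-- but the rows are still compressed with the encodings already on the columns.
COPY events
FROM 's3://my-bucket/events/batch-0002/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole';
```

Keep in mind that `pg_table_def` only lists tables in schemas that are on your `search_path`.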

Upvotes: 4
