Sushil Ks

Reputation: 403

How to enable Snappy compression for data already loaded in Hive?

I have several TB of data in my Hive warehouse and am trying to enable Snappy compression for it. I know that we can enable Hive output compression using

hive> SET hive.exec.compress.output=true;
hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

while loading the data into Hive. But how do I compress the data that is already loaded?

Upvotes: 1

Views: 5403

Answers (1)

Sandy

Reputation: 279

Hive's ORCFile format supports compressed storage. To convert existing data to ORCFile, create a new table with the same schema as the source table, add STORED AS ORC, and copy the data into it:

CREATE TABLE A_ORC (
    customerID int, name string, ..etc
) STORED AS ORC TBLPROPERTIES ("orc.compress" = "SNAPPY");

INSERT INTO A_ORC SELECT * FROM A; 

Here A_ORC is the new table and A is the source table.
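Once the copy has finished, you will probably want to check it and then point existing queries at the compressed table. A minimal sketch of how that could look, assuming A is an unpartitioned managed table and that nothing writes to it during the switch (A_OLD is just a placeholder name, not something from the question):

```sql
-- Confirm that the table parameters list orc.compress = SNAPPY.
DESCRIBE FORMATTED A_ORC;

-- Compare row counts; the two results should match before you go further.
SELECT COUNT(*) FROM A;
SELECT COUNT(*) FROM A_ORC;

-- Swap the names so existing queries read the ORC table.
-- A_OLD is a hypothetical backup name; keep it until you trust the new copy.
ALTER TABLE A RENAME TO A_OLD;
ALTER TABLE A_ORC RENAME TO A;
```

Drop A_OLD only after you have validated the new table, since until then it is the only uncompressed copy of the original data.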

Here you can learn more about ORCFile.

Upvotes: 1
