Coinnigh

Reputation: 619

Is the FileOutputFormat.setCompressOutput(job, true); optional?

In a Hadoop program, I tried to compress the job output, so I wrote the following code:

FileOutputFormat.setCompressOutput(job, true); 
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

The output was compressed. When I delete the first line:

FileOutputFormat.setCompressOutput(job, true); 

and run the program again, the result is the same. Is the call

FileOutputFormat.setCompressOutput(job, true);

optional? What does that call actually do?

Upvotes: 1

Views: 140

Answers (1)

Ram Ghadiyaram

Reputation: 29195

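Please see the methods below from FileOutputFormat.java. setOutputCompressorClass internally makes the very call you deleted,

i.e. setCompressOutput(conf, true);

So when you set the Gzip codec class, compression is switched on as a side effect: choosing a codec only makes sense if the output is meant to be compressed. The explicit setCompressOutput call is therefore redundant whenever you also call setOutputCompressorClass.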

  /**
   * Set whether the output of the job is compressed.
   * @param conf the {@link JobConf} to modify
   * @param compress should the output of the job be compressed?
   */
  public static void setCompressOutput(JobConf conf, boolean compress) {
    conf.setBoolean("mapred.output.compress", compress);
  }

  /**
   * Set the {@link CompressionCodec} to be used to compress job outputs.
   * @param conf the {@link JobConf} to modify
   * @param codecClass the {@link CompressionCodec} to be used to
   *                   compress the job outputs
   */
  public static void setOutputCompressorClass(JobConf conf,
      Class<? extends CompressionCodec> codecClass) {
    setCompressOutput(conf, true);
    conf.setClass("mapred.output.compression.codec", codecClass,
                  CompressionCodec.class);
  }
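
To see the effect without a cluster, here is a minimal stand-in sketch of the two methods quoted above. It is not Hadoop code: the class name CompressionFlagSketch is made up for illustration, and a plain Map takes the place of JobConf so that it runs with only the JDK. The property names are copied from the quoted source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two FileOutputFormat helpers quoted above.
// A plain Map replaces JobConf so the sketch runs without Hadoop on the classpath.
final class CompressionFlagSketch {

    // Property names copied from the quoted source.
    static final String COMPRESS_KEY = "mapred.output.compress";
    static final String CODEC_KEY = "mapred.output.compression.codec";

    // Mirrors setCompressOutput: only records the boolean flag.
    static void setCompressOutput(Map<String, String> conf, boolean compress) {
        conf.put(COMPRESS_KEY, Boolean.toString(compress));
    }

    // Mirrors setOutputCompressorClass: turns compression on, then records the codec.
    static void setOutputCompressorClass(Map<String, String> conf, String codecClassName) {
        setCompressOutput(conf, true);
        conf.put(CODEC_KEY, codecClassName);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Only the codec call is made, as in the question after the first line was deleted.
        setOutputCompressorClass(conf, "org.apache.hadoop.io.compress.GzipCodec");
        System.out.println(COMPRESS_KEY + " = " + conf.get(COMPRESS_KEY));
        System.out.println(CODEC_KEY + " = " + conf.get(CODEC_KEY));
    }
}
```

Even though setCompressOutput is never called directly in main, the compress flag ends up as true, which matches what you observed. The explicit call only matters if you want compression with the default codec, or if you want to switch compression off again with false.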

Upvotes: 1
