Reputation: 1919
I got the above error when using MLUtils' saveAsLibSVMFile. I tried various approaches, like the ones below, but nothing worked.
/*
conf.set("spark.io.compression.codec","org.apache.spark.io.LZFCompressionCodec")
*/
/*
conf.set("spark.executor.extraClassPath","/usr/hdp/current/hadoop-client/lib/snappy-java-*.jar")
conf.set("spark.driver.extraClassPath","/usr/hdp/current/hadoop-client/lib/snappy-java-*.jar")
conf.set("spark.executor.extraLibraryPath","/usr/hdp/2.3.4.0-3485/hadoop/lib/native")
conf.set("spark.driver.extraLibraryPath","/usr/hdp/2.3.4.0-3485/hadoop/lib/native")
*/
I read the following link: https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
In the end, there were only two ways I could solve it. Both are given in the answer below.
Upvotes: 3
Views: 8971
Reputation: 1919
One approach was to switch the job's output compression to a different Hadoop codec (BZip2 instead of Snappy), like below. Note that CompressionType needs an import:

import org.apache.hadoop.io.SequenceFile.CompressionType

sc.hadoopConfiguration.set("mapreduce.output.fileoutputformat.compress", "true")
sc.hadoopConfiguration.set("mapreduce.output.fileoutputformat.compress.type", CompressionType.BLOCK.toString)
sc.hadoopConfiguration.set("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.BZip2Codec")
sc.hadoopConfiguration.set("mapreduce.map.output.compress", "true")
sc.hadoopConfiguration.set("mapreduce.map.output.compress.codec", "org.apache.hadoop.io.compress.BZip2Codec")
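For context, here is a minimal sketch of how those settings combine with the failing call. It assumes an existing SparkContext `sc`, an `RDD[LabeledPoint]` named `data`, and a placeholder output path; it is not runnable outside a Spark cluster:

```scala
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// With BZip2 configured on sc.hadoopConfiguration as above, the write
// path no longer requires the native Snappy library.
// `data` and the HDFS output directory are placeholders.
val data: RDD[LabeledPoint] = ???
MLUtils.saveAsLibSVMFile(data, "hdfs:///tmp/libsvm-output")
```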
The second approach was to pass --driver-library-path /usr/hdp/<whatever is your current version>/hadoop/lib/native/
as a parameter to my spark-submit job (on the command line).
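As a sketch, the full spark-submit invocation might look like the following. The application JAR, main class, and master are placeholders, and the HDP version in the native-library path is the one from the question; substitute your own:

```shell
spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --driver-library-path /usr/hdp/2.3.4.0-3485/hadoop/lib/native/ \
  my-app.jar
```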
Upvotes: 2