user3364545

Reputation: 133

Apache Spark: "Unable to load native-hadoop library for your platform... using builtin-java classes where applicable" and execution terminates

I am using Apache Spark on a Windows 10 64-bit machine. I have installed Java, Python 3.6, and spark-2.3.1-bin-hadoop2.7. I am using the VS Code editor for PySpark coding.

When I run the PySpark code from VS Code using spark-submit, it shows

Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

and is terminating the execution.

Relevant code:

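from pyspark import SparkContext, SparkConf

if __name__ == "__main__":
    # Run locally with two worker threads
    conf = SparkConf().setAppName("word count").setMaster("local[2]")
    sc = SparkContext(conf=conf)
    # Relative path: resolved against the directory spark-submit is run from
    lines = sc.textFile("in/word_count.text")
    words = lines.flatMap(lambda line: line.split(" "))
    # countByValue() returns a dict-like mapping of word -> count to the driver
    wordcounts = words.countByValue()
    for word, count in wordcounts.items():
        print("{} : {}".format(word, count))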

Spark Execution Error:

[Image: Spark Execution Error]

Upvotes: 0

Views: 3976

Answers (1)

Nauman Naeem

Reputation: 408

You can safely ignore this warning; it is not what is terminating your job. According to the Hadoop documentation:

The native hadoop library is supported on *nix platforms only. The library does not work with Cygwin or the Mac OS X platform.

The native hadoop library is mainly used on the GNU/Linux platform and has been tested on these distributions:

RHEL4/Fedora
Ubuntu
Gentoo

On all the above distributions a 32/64 bit native hadoop library will work with a respective 32/64 bit jvm.
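Since the warning itself is harmless, the real cause has to be found elsewhere. As a rough sketch only (not something from the question or from PySpark), a small pre-flight check like the one below might help narrow it down before the SparkContext is created. The function name preflight_problems is hypothetical, and the checks rest on assumptions: that the relative input path is resolved against the current working directory, and that a Windows install of a Hadoop 2.7 build expects winutils.exe under HADOOP_HOME/bin. It uses only the Python standard library.

```python
import os
import sys


def preflight_problems(input_path, env=None, platform=None):
    """Return a list of likely setup problems to look at before starting Spark.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    problems = []

    # A relative path such as "in/word_count.text" is resolved against the
    # current working directory, which may not be the script's folder.
    if not os.path.exists(input_path):
        problems.append("input not found: " + os.path.abspath(input_path))

    # Spark's launcher uses JAVA_HOME when set, otherwise java from PATH.
    if not env.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set; Spark will look for java on PATH")

    # Assumption: on Windows, Hadoop 2.7 builds usually expect winutils.exe
    # to be present in the bin folder under HADOOP_HOME.
    if platform.startswith("win"):
        hadoop_home = env.get("HADOOP_HOME")
        if not hadoop_home:
            problems.append("HADOOP_HOME is not set")
        elif not os.path.isfile(os.path.join(hadoop_home, "bin", "winutils.exe")):
            problems.append("winutils.exe not found under HADOOP_HOME/bin")

    return problems


if __name__ == "__main__":
    for problem in preflight_problems("in/word_count.text"):
        print("WARNING:", problem)
```

If this prints nothing, the setup basics are probably fine and the full stack trace from spark-submit is the next place to look.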

Upvotes: 3
