simon_dmorias

Reputation: 2473

WinUtils & Spark 3.1.1 Failures

I'm trying to set up a Windows 10 developer VM with a Conda environment and PySpark, but I keep running into problems getting Spark and winutils to work together.

Environment:

I have created C:\Hadoop\bin and downloaded winutils from here https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.1/bin (I've also tried 3.2.0).

HADOOP_HOME is C:\Hadoop and Path has %HADOOP_HOME%\bin in it. JAVA_HOME is correct.
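A quick way to sanity-check the Path entry is to compare each entry against the expected bin folder. This is only a minimal sketch: the helper name `path_has_dir` is made up for illustration, and it assumes entries are separated by `;` as on Windows. It uses `ntpath` so the comparison follows Windows rules (case-insensitive, either slash) on any platform.

```python
import ntpath
import os


def path_has_dir(path_value, directory, sep=";"):
    """Return True if `directory` is one of the entries in a Windows-style PATH string.

    Hypothetical helper: entries are compared after normalising case,
    slashes and trailing separators.
    """
    target = ntpath.normcase(ntpath.normpath(directory))
    for entry in path_value.split(sep):
        entry = entry.strip().strip('"')
        if entry and ntpath.normcase(ntpath.normpath(entry)) == target:
            return True
    return False


if __name__ == "__main__":
    # Falls back to the folder used in this question if HADOOP_HOME is unset.
    hadoop_home = os.environ.get("HADOOP_HOME", r"C:\Hadoop")
    bin_dir = ntpath.join(hadoop_home, "bin")
    print(bin_dir, "on PATH:", path_has_dir(os.environ.get("PATH", ""), bin_dir))
```

An entry that was never expanded (a literal `%...` string) will not match, so this also catches a malformed variable reference.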

This code works:

location = 'C:/myfiles/file.csv'
df = spark.read.format("csv").options(header=True).load(location)

This code fails:

location = 'C:/myfiles/'
df = spark.read.format("csv").options(header=True).load(location)

Error message:

An error occurred while calling o35.load.
: java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
    at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1435)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:493)

Winutils is definitely being picked up: if I delete it, the first example above also breaks, in the expected way.

It seems that winutils is incompatible with Spark 3.1.1, specifically when loading a folder of files? I find that hard to believe.

Bizarrely, though, I have another machine with PySpark 3.1.1 and this version of winutils, and it works! Same Java version as well. I'm lost. I've even copied the winutils files from the working machine to this one and it still didn't work.

Can anyone guide me on what that error means at least to help me understand where the issue could be?

Upvotes: 1

Views: 4977

Answers (1)

Sarthak Agrawal

Reputation: 586

In my case, I was able to resolve the issue by downloading hadoop.dll from the same link given in the question and placing it in the same folder as winutils (C:\Hadoop\bin in the question's setup). Just make sure there is no version mismatch between the files.
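To confirm that both files are actually in place, something like the sketch below can help. The function name `missing_native_files` is hypothetical, and the default folder is simply the one from the question; it only checks that the files exist, not that their versions match.

```python
import os

# Files expected side by side in <HADOOP_HOME>\bin.
REQUIRED_FILES = ("winutils.exe", "hadoop.dll")


def missing_native_files(hadoop_home):
    """Return the required file names that are not present in <hadoop_home>/bin."""
    bin_dir = os.path.join(hadoop_home, "bin")
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(bin_dir, name))]


if __name__ == "__main__":
    # Falls back to the folder used in the question if HADOOP_HOME is unset.
    home = os.environ.get("HADOOP_HOME", r"C:\Hadoop")
    missing = missing_native_files(home)
    print("missing:", missing if missing else "none")
```

For the version side, the Hadoop version that Spark is running with can usually be read from a live session with `spark.sparkContext._jvm.org.apache.hadoop.util.VersionInfo.getVersion()` (note that `_jvm` is a private attribute), and compared against the folder name the files were downloaded from.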

Upvotes: 3
