Reputation: 2473
I'm trying to create a Windows 10 developer VM with a Conda environment and PySpark, but I keep hitting problems getting Spark and winutils to work.
Environment:
I have created C:\Hadoop\bin and downloaded winutils from here https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.1/bin (I've also tried 3.2.0).
HADOOP_HOME is C:\Hadoop and Path contains %HADOOP_HOME%\bin. JAVA_HOME is correct.
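For reference, a quick sanity check of the environment from Python looks like this (a minimal sketch based on the paths above):

import os

hadoop_home = os.environ.get("HADOOP_HOME")
print("HADOOP_HOME:", hadoop_home)                      # expect C:\Hadoop
print("JAVA_HOME:  ", os.environ.get("JAVA_HOME"))

# PATH entries are stored expanded, so look for the literal ...\Hadoop\bin
bin_dir = os.path.join(hadoop_home or "", "bin")
on_path = any(os.path.normcase(p.rstrip("\\")) == os.path.normcase(bin_dir)
              for p in os.environ.get("PATH", "").split(os.pathsep))
print("bin on PATH:", on_path)
print("winutils.exe present:", os.path.isfile(os.path.join(bin_dir, "winutils.exe")))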
This code works:
location = 'C:/myfiles/file.csv'
df = spark.read.format("csv").options(header=True).load(location)
This code fails:
location = 'C:/myfiles/'
df = spark.read.format("csv").options(header=True).load(location)
Error message:
An error occurred while calling o35.load.
: java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1435)
at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:493)
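For completeness, both snippets above assume a local SparkSession created along these lines (a minimal sketch; the app name is just a placeholder):

from pyspark.sql import SparkSession

# Plain local-mode session, nothing Hadoop-specific configured
spark = (SparkSession.builder
         .master("local[*]")
         .appName("winutils-test")
         .getOrCreate())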
Winutils is being picked up, because if I delete it the first example above also breaks in the expected way.
It seems as if winutils is incompatible with Spark 3.1.1, and specifically with loading a folder of files? I find that hard to believe.
Bizarrely, though, I have another machine with PySpark 3.1.1, this version of winutils, and the same Java version, and it works! I'm lost - I've even copied the winutils files from the working machine to this one and it still didn't work.
Can anyone guide me on what that error means at least to help me understand where the issue could be?
Upvotes: 1
Views: 4977
Reputation: 586
In my case, I was able to resolve the issue by adding hadoop.dll (from the same link as provided in the question above) to the same location as winutils. Just make sure there is no version mismatch.
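As a quick sanity check before restarting Spark, something along these lines confirms both native pieces are in place (a minimal sketch; the C:\Hadoop fallback is an assumption):

import os

# hadoop.dll backs NativeIO$Windows.access0, the call failing in the stack
# trace above; winutils.exe alone does not cover that code path.
hadoop_bin = os.path.join(os.environ.get("HADOOP_HOME", r"C:\Hadoop"), "bin")
for name in ("winutils.exe", "hadoop.dll"):
    path = os.path.join(hadoop_bin, name)
    print(path, "->", "found" if os.path.isfile(path) else "MISSING")

Also restart the Python/Spark process after adding the DLL, since native libraries are loaded when the JVM starts.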
Upvotes: 3