Ramesh Bathini

Reputation: 43

Why does reading an Excel file not work with Crealytics version spark-excel_2.12-3.5.0_0.20.1?

I was able to read Excel file data using the Crealytics library spark-excel_2.12-3.4.1_0.19.0, but the same code fails with the latest version, spark-excel_2.12-3.5.0_0.20.1.

I tried both of the calls below, but neither works with the latest library, spark-excel_2.12-3.5.0_0.20.1.

# Attempt 1: fully qualified data source name
df = (
    spark.read.format("com.crealytics.spark.excel")
    .option("inferschema", True)
    .option("header", True)
    .option("ignoreLeadingWhiteSpace", "true")
    .option("ignoreTrailingWhiteSpace", "true")
    .option("keepUndefinedRows", True)
    .option("dataAddress", "'SHEET_NAME'!")
    .load(path)
)

# Attempt 2: short data source name
df = (
    spark.read.format("excel")
    .option("inferschema", True)
    .option("header", True)
    .option("ignoreLeadingWhiteSpace", "true")
    .option("ignoreTrailingWhiteSpace", "true")
    .option("keepUndefinedRows", True)
    .option("dataAddress", "'SHEET_NAME'!")
    .load(path)
)

It throws the following error with the latest Crealytics version:

Py4JJavaError: An error occurred while calling o412.load.
: java.lang.ClassNotFoundException: 
Failed to find data source: excel. Please find packages at
https://spark.apache.org/third-party-projects.html
       
    at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:837)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:744)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:794)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:328)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:237)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
    at py4j.Gateway.invoke(Gateway.java:306)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: excel.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:730)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:730)
    at scala.util.Failure.orElse(Try.scala:224)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:730)
    ... 15 more

Note that I am executing the above code in a Databricks Python cell with Scala version 2.x.

My question is: why does this stop working after upgrading to the latest version of Crealytics?

Upvotes: 0

Views: 542

Answers (1)

Ramesh Bathini

Reputation: 43

Sorry for wasting your time here. I found that this is an issue with the library, and it has already been reported. Here is the link:

https://github.com/crealytics/spark-excel/issues/789
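
Separately from the linked issue, one thing that can be ruled out quickly is a mismatch between the Spark version a jar was built for and the Spark version of the cluster. The sketch below is only a sanity check, not a fix. It assumes that artifact names follow the pattern seen in the question (spark-excel_<scala>-<spark>_<library>); the helper names parse_spark_excel_artifact and same_spark_minor are hypothetical and not part of the library.

```python
import re

# Assumed naming pattern, taken from the artifact names in the question:
#   spark-excel_<scala version>-<spark version>_<library version>
_ARTIFACT_RE = re.compile(
    r"spark-excel_(\d+\.\d+)-(\d+\.\d+\.\d+)_(\d+\.\d+\.\d+)(?:\.jar)?"
)


def parse_spark_excel_artifact(name):
    """Split an artifact name into its Scala, Spark and library versions."""
    match = _ARTIFACT_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised artifact name: {name!r}")
    scala, spark, library = match.groups()
    return {"scala": scala, "spark": spark, "library": library}


def same_spark_minor(artifact_name, running_spark_version):
    """True if the jar targets the same major.minor Spark as the cluster."""
    built_for = parse_spark_excel_artifact(artifact_name)["spark"]
    return built_for.split(".")[:2] == running_spark_version.split(".")[:2]
```

In a notebook this could be called as same_spark_minor("spark-excel_2.12-3.5.0_0.20.1", spark.version), where spark.version is the version string reported by the running session. A False result would only suggest that the jar and the runtime target different Spark releases; it does not by itself explain the ClassNotFoundException above.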

Upvotes: 0
