Reputation: 735
I am new to Spark. I have an Excel file that I need to read into a DataFrame, and I am using the com.crealytics.spark.excel
library to do so. The following is my code:
val df = hiveContext.read.format("com.crealytics.spark.excel")
  .option("useHeader", "true")
  .option("treatEmptyValuesAsNulls", "true")
  .load("file:///data/home/items.xlsx")
The above code runs without any error, and I can also count the rows with df.count. But when I try to print the DataFrame with df.show, it throws:
java.lang.NoSuchMethodError: scala.util.matching.Regex.unapplySeq(Ljava/lang/CharSequence;)Lscala/Option;
I am using Spark 1.6, Java 1.8, and Scala 2.10.5.
I am not sure why this is happening. How do I fix this error so I can look at the data in the DataFrame?
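One thing worth ruling out first: the missing method, Regex.unapplySeq(CharSequence), is an overload that only exists in Scala 2.11+, so the Scala version the job actually runs on matters. A minimal sketch for printing the runtime Scala version (standard library only, no Spark needed):

```scala
// Prints the Scala version of the JVM actually executing the job,
// to confirm whether it is really 2.10.5 or a different binary.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    val scalaVersion = scala.util.Properties.versionNumberString
    println(s"Scala version: $scalaVersion")
  }
}
```

Running this inside the same spark-submit environment shows which Scala binary the classpath resolves to.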
UPDATE:
I also tried using a StructType to define the schema and impose it while loading the data into the DataFrame:
val newschema = StructType(List(
  StructField("1", StringType, nullable = true),
  StructField("2", StringType, nullable = true),
  StructField("3", StringType, nullable = true),
  StructField("4", StringType, nullable = true),
  StructField("5", StringType, nullable = true),
  StructField("6", StringType, nullable = true),
  StructField("7", StringType, nullable = true),
  StructField("8", StringType, nullable = true),
  StructField("9", StringType, nullable = true),
  StructField("10", StringType, nullable = true)))
val df = hiveContext.read.schema(newschema).format("com.crealytics.spark.excel")...
This doesn't help; I get the same error as before when I try to display the DataFrame.
UPDATE-2:
I also tried loading the DataFrame using a SQLContext instead. It still gives me the same error.
Any help would be appreciated. Thank you.
Upvotes: 0
Views: 1187
Reputation: 735
So, apparently, com.crealytics.spark.excel only works with Spark 2.0 and above. Updating my dependencies and running the jar with Spark 2.0 gives the expected result without any errors.
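A likely explanation: the NoSuchMethodError complains about Regex.unapplySeq(CharSequence), an overload added in Scala 2.11, so the library binaries appear to be compiled against Scala 2.11, which is what Spark 2.x uses (Spark 1.6 builds default to Scala 2.10). A sketch of the updated build.sbt, with illustrative version numbers rather than the exact ones I used:

```scala
// build.sbt -- move to Scala 2.11 and Spark 2.x so the binary
// versions match what spark-excel was compiled against.
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % "2.0.2" % "provided",
  "com.crealytics"   %% "spark-excel" % "0.8.2"
)
```

The reading code itself does not need to change; only the dependency versions (and the Spark runtime the jar is submitted to) do.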
I hope this helps somebody in the future.
Upvotes: 1