Reputation: 89
I have the following Spark program, whose purpose is to convert a copybook file into a Parquet file. It comes from Cobrix (https://github.com/AbsaOSS/cobrix); I am just trying to run the file CobolCopybookExample1.scala (located in cobrix-master\cobrix-master\spark-cobol\src\main\scala\za\co\absa\cobrix\spark\cobol\examples).
Its source copybook file is in cobrix-master\cobrix-master\examples\example_data.
Spark does not provide a built-in library for copybook data transformation, so I am using Cobrix, an open-source library for that purpose, in my program.
These are the steps I have followed so far, and the error I ran into.
I needed 4 prerequisite JAR files:
spark-cobol-0.3.0.jar
cobol-parser-0.3.0.jar
scodec-core_2.11-1.10.3.jar
scodec-bits_2.11-1.1.4.jar
1) I downloaded these JARs and placed them in a folder named Cobol on my VM's Desktop.
2) I launched spark-shell with the following command from the JAR location. It launched successfully.
spark-shell --master yarn --deploy-mode client --driver-cores 4 --driver-memory 4G --jars spark-cobol-0.3.0.jar,cobol-parser-0.3.0.jar,scodec-core_2.11-1.10.3.jar,scodec-bits_2.11-1.1.4.jar
3) Next I needed two imports before I could call the Spark reader, so I ran:
import org.apache.spark.sql.{SaveMode, SparkSession}
import za.co.absa.cobrix.spark.cobol.utils.SparkUtils
4) Then I tried to create the DataFrame and got the error shown below. It looks like an environment problem to me, but I would like your advice on it; I am struggling to resolve it.
val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", "file:///home/bigdata/Desktop/Cobol/example_data/raw_file.cob")
  .load("file:///home/bigdata/Desktop/Cobol/example_data/raw_data")
After that I get this error:
java.lang.NoClassDefFoundError: java/time/temporal/TemporalAccessor
at za.co.absa.cobrix.spark.cobol.reader.fixedlen.FixedLenNestedReader.loadCopyBook(FixedLenNestedReader.scala:76)
at za.co.absa.cobrix.spark.cobol.reader.fixedlen.FixedLenNestedReader.<init>(FixedLenNestedReader.scala:42)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createFixedLengthReader(DefaultSource.scala:83)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.buildEitherReader(DefaultSource.scala:70)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:54)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:45)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:135)
... 50 elided
Caused by: java.lang.ClassNotFoundException: java.time.temporal.TemporalAccessor
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 59 more
I am also attaching my Scala program.
Upvotes: 3
Views: 3050
Reputation: 767
It seems that some dependencies are missing, or that spark-submit is not including the JARs you pass through --jars.
Can you try again after moving these JARs into Spark's lib folder?
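One way to check whether the JARs passed through --jars are actually visible is to probe for a class from each of them. The following is only a rough sketch: `ClasspathCheck` and `isLoadable` are hypothetical names, the Cobrix class name is taken from the stack trace in the question, and the two scodec class names are my assumption about what those JARs contain.

```scala
object ClasspathCheck {
  // Returns true if the named class is visible to the current thread's class loader.
  // The class is located but not initialized (initialize = false).
  def isLoadable(className: String): Boolean =
    try {
      Class.forName(className, false, Thread.currentThread().getContextClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    val probes = Seq(
      "za.co.absa.cobrix.spark.cobol.source.DefaultSource", // appears in the stack trace
      "scodec.bits.BitVector",                              // assumed to be in scodec-bits
      "scodec.Codec"                                        // assumed to be in scodec-core
    )
    // Print one line per probe: class name and whether it could be found.
    probes.foreach(name => println(s"$name -> ${isLoadable(name)}"))
  }
}
```

In spark-shell this could be pasted with :paste and run as `ClasspathCheck.main(Array.empty)`. If every probe prints true, the JARs are probably being picked up and the cause is likely elsewhere.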
Upvotes: 0
Reputation: 485
Could you try rebuilding your Scala project with SBT or Maven? Here is an interesting article. You need a fat JAR (with this you won't need --jars in the spark-submit command). The error seems to occur because one of the JARs depends on other JARs. Check, for example, whether Cobol Parser needs some compile dependencies in order to work correctly.
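Before rebuilding, it may also be worth ruling out the Java runtime itself. The class named in the error, java.time.temporal.TemporalAccessor, belongs to the java.time package, which is part of the JDK only from Java 8 onward, so a ClassNotFoundException for it can indicate that the driver is running on an older JVM. A minimal sketch of such a check (`JavaRuntimeCheck`, `majorVersion` and `hasJavaTime` are hypothetical helper names, not part of Spark or Cobrix):

```scala
object JavaRuntimeCheck {
  // Extracts the major version from a java.version string.
  // "1.7.0_80" gives 7, "1.8.0_292" gives 8, "11.0.2" gives 11.
  def majorVersion(version: String): Int = {
    val parts = version.split("[._-]")
    if (parts(0) == "1" && parts.length > 1) parts(1).toInt else parts(0).toInt
  }

  // True if the class from the error message can be found on this JVM.
  def hasJavaTime: Boolean =
    try {
      Class.forName("java.time.temporal.TemporalAccessor")
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  def main(args: Array[String]): Unit = {
    val version = System.getProperty("java.version")
    println(s"java.version      = $version")
    println(s"java.home         = ${System.getProperty("java.home")}")
    println(s"major version     = ${majorVersion(version)}")
    println(s"java.time present = $hasJavaTime")
  }
}
```

In spark-shell this could be pasted with :paste and run as `JavaRuntimeCheck.main(Array.empty)`. It only inspects the driver JVM; on YARN the executors may use a different Java installation, so that would need a separate check.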
Upvotes: 2