Reputation: 159
On Windows 10, in IntelliJ, this path ("C:/hive/Orders_[0-9]*.csv") works fine when the code runs as a standalone Java Spark job, but it fails when run as a Spring Boot Spark job. It seems Spring Boot is not picking up the native Hadoop filesystem libraries, and I am not sure how to resolve this.
Dataset<Row> DF1 = spark
.read().format("csv")
.option("header", "true")
.option("delimiter", "\t")
.load("C:/hive/Orders_[0-9]*.csv");
Error:
Error starting ApplicationContext. To display the auto-configuration report re-run your application with 'debug' enabled.
2019-09-04 21:59:27.701 ERROR [omni-ods-migration,,,] 8216 --- [ main] o.s.boot.SpringApplication : Application startup failed
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'odsMigrationService': Invocation of init method failed; nested exception is java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:137)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:409)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1620)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:555)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:761)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:867)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:543)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:693)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:360)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:303)
at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:134)
at com.jcpenney.ods.OdsMigration.main(OdsMigration.java:20)
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1435)
at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:493)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
at org.apache.hadoop.fs.Globber.listStatus(Globber.java:77)
at org.apache.hadoop.fs.Globber.doGlob(Globber.java:235)
at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2016)
at org.apache.spark.deploy.SparkHadoopUtil.globPath(SparkHadoopUtil.scala:241)
at org.apache.spark.deploy.SparkHadoopUtil.globPathIfNecessary(SparkHadoopUtil.scala:247)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:383)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:379)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:355)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:379)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
at com.jcpenney.ods.service.OdsMigrationService.readHfsFile(OdsMigrationService.java:588)
at com.jcpenney.ods.service.OdsMigrationService.processOrders(OdsMigrationService.java:334)
at com.jcpenney.ods.service.OdsMigrationService.run(OdsMigrationService.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:366)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:311)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:134)
... 16 common frames omitted
The code below also works fine in Spring Boot when the path is given with an exact file name.
Dataset<Row> DF1 = spark
.read().format("csv")
.option("header", "true")
.option("delimiter", "\t")
.load("C:/hive/Orders_000001.csv");
How can I fix this?
Upvotes: 3
Views: 7235
Reputation: 1236
Here is a possible solution:

1. Download winutils.exe and place it at C:\hadoop\bin\winutils.exe.
2. Set the HADOOP_HOME environment variable to C:\hadoop.
3. Add %HADOOP_HOME%\bin to the PATH environment variable.
4. Copy hadoop.dll to Windows\System32 so the JVM can find the native I/O library.
5. (may not be necessary) Set the properties in code:
System.setProperty("hadoop.home.dir", "C:/hadoop/");
System.load("C:/hadoop/bin/hadoop.dll");
Upvotes: 8