Reputation: 73
We are running Spark 2.4 on a standalone cluster. We need to upgrade the AWS SDK to 1.12.654 (we are not ready for AWS SDK v2 yet), and because of that we upgraded to Hadoop 3.1.1. After working through a lot of dependency issues, we are now stuck on a java.lang.NoSuchMethodError that we cannot figure out.
The exception is:
Caused by: java.lang.NoSuchMethodError: com.amazonaws.http.HttpResponse.getHttpRequest()Lcom/amazonaws/thirdparty/apache/http/client/methods/HttpRequestBase;
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3ObjectResponseHandler.java:57)
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3ObjectResponseHandler.java:29)
at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:69)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1794)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleSuccessResponse(AmazonHttpClient.java:1477)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5520)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5467)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1554)
at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:183)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:182)
at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$lazySeek$1(S3AInputStream.java:328)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$2(Invoker.java:190)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:188)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:210)
at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:321)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:433)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:218)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:176)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:124)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:65)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:47)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.readFile(CSVDataSource.scala:199)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:142)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:136)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
Looking at the jar versions, S3ObjectResponseHandler comes from aws-java-sdk-s3-1.12.654.jar, and it calls HttpResponse, which comes from aws-java-sdk-core-1.12.654.jar.
When I look at the imports in the HttpResponse class, it has
import org.apache.http.client.methods.HttpRequestBase;
so that class appears to come from the httpclient dependency, but the exception refers to com/amazonaws/thirdparty/apache/http/client/methods/HttpRequestBase,
which I found in aws-java-sdk-bundle-1.12.654.jar. I am not really sure how to fix this. I have excluded httpclient from the dependencies and tried to force the use of aws-java-sdk-bundle. I have been stuck for a week and am not sure what I am missing.
Any pointers would be appreciated.
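One way to make this kind of clash fail fast at build time (a hedged sketch, not something from the original setup; the version and execution id are illustrative) is Maven's enforcer plugin, banning the unshaded SDK artifacts once the bundle is in use:

```xml
<!-- Sketch: fail the build if the individual (unshaded) SDK jars end up on the
     classpath next to aws-java-sdk-bundle. Version and ids are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <id>ban-unshaded-aws-sdk</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <excludes>
              <exclude>com.amazonaws:aws-java-sdk-core</exclude>
              <exclude>com.amazonaws:aws-java-sdk-s3</exclude>
            </excludes>
          </bannedDependencies>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this in place, `mvn verify` fails whenever a transitive dependency drags aws-java-sdk-core or aws-java-sdk-s3 back in alongside the bundle.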
Update:
I see the following in the AWS bundle POM file. I am guessing this relocation is why the package name shows as com.amazonaws.thirdparty.apache.http when the exception is thrown, but I still do not understand why the method is not found when the bundle is on the classpath:
<relocations>
  <relocation>
    <pattern>org.apache.http</pattern>
    <shadedPattern>com.amazonaws.thirdparty.apache.http</shadedPattern>
  </relocation>
</relocations>
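To confirm which jar actually supplies each class at runtime, a small JDK-only helper like the one below can be run with the application classpath (the class and method names here are my own, not part of any SDK):

```java
// Prints the location a class is loaded from, to spot classpath conflicts.
// Uses only the JDK; pass fully-qualified class names as arguments.
public class WhereFrom {

    static String locate(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        // getResource resolves through the same class loader that loaded c,
        // so the URL shows the actual jar (or jrt module) the class came from.
        return String.valueOf(c.getResource("/" + className.replace('.', '/') + ".class"));
    }

    public static void main(String[] args) throws Exception {
        // e.g. java -cp <app classpath> WhereFrom com.amazonaws.http.HttpResponse
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it against com.amazonaws.http.HttpResponse and com.amazonaws.services.s3.internal.S3ObjectResponseHandler would show whether they resolve from aws-java-sdk-bundle or from the individual core/s3 jars, which is exactly the mix that produces this NoSuchMethodError.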
Thanks.
Upvotes: 0
Views: 422
Reputation: 13480
hadoop-aws uses the aws-java-sdk-bundle JAR as its dependency precisely to avoid inconsistencies between the individual AWS artifacts. Do the same: depend only on the bundle, and remove the individual aws-java-sdk-* JARs from your classpath.
see: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws
I also recommend upgrading to the Hadoop 3.3.6 release, for all hadoop-* JARs.
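A minimal dependency sketch of that setup (versions are illustrative; match the bundle version to what your chosen hadoop-aws expects, and do not also declare aws-java-sdk-s3 or aws-java-sdk-core):

```xml
<!-- Sketch: only the bundle supplies com.amazonaws classes, so the shaded
     com.amazonaws.thirdparty.apache.http types stay consistent. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>3.3.6</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-bundle</artifactId>
    <version>1.12.654</version>
  </dependency>
</dependencies>
```

Your NoSuchMethodError is the classic symptom of mixing the two: the unshaded aws-java-sdk-s3 classes expect HttpResponse.getHttpRequest() to return org.apache.http.client.methods.HttpRequestBase, while the bundle's copy of HttpResponse (loaded first) returns the shaded com.amazonaws.thirdparty.apache.http variant, so the method signature the caller was compiled against does not exist.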
Upvotes: 0