Reputation: 1486
I have a Spark job that writes its results to an S3 bucket. When the output path is the bucket root, like s3a://bucket_name/, I get an error:
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 404, AWS Service: Amazon S3, AWS Request ID: xxx, AWS Error Code: NoSuchKey, AWS Error Message: null, S3 Extended Request ID: xxx
but when I write to a subfolder inside the bucket (s3a://bucket_name/subfolder/), it works!
I'm using hadoop-aws 2.7.3 to access S3.
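For context, a minimal sketch of the two cases (the DataFrame contents and the Parquet format are placeholders for whatever the real job writes):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical reproduction; the data and format stand in for the real job.
val spark = SparkSession.builder().appName("s3-root-write").getOrCreate()
import spark.implicits._
val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

// Writing to the bucket root fails with the 404 / NoSuchKey error above:
df.write.parquet("s3a://bucket_name/")

// Writing to a subfolder under the same bucket works:
df.write.parquet("s3a://bucket_name/subfolder/")
```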
What is the problem?
Thanks in advance.
Upvotes: 1
Views: 668
Reputation: 13430
Not a Spark bug. It's an issue in how the S3 clients work with the root directory of a bucket: root directories are "special". HADOOP-13402 looks at part of this. The stack trace you have there is clearly from Amazon's own object store client, but it behaves the same way.
To look at it another way: you wouldn't commit work to "file:///" or "hdfs:///" either; everything expects a subdirectory.
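As a rough illustration of why the root is special, a sketch using Hadoop's Path API (the bucket name is a placeholder; this is not the actual commit code, just the property that rename-based commit algorithms implicitly rely on):

```scala
import org.apache.hadoop.fs.Path

// Illustrative only: the bucket root is the filesystem root, so it has
// no parent directory, unlike any subfolder beneath it.
val root = new Path("s3a://bucket_name/")
val sub  = new Path("s3a://bucket_name/subfolder/")

println(root.getParent) // null: the root has nothing above it
println(sub.getParent)  // s3a://bucket_name/: an ordinary parent directory
```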
Sorry.
Upvotes: 1