Reputation: 8342
I want to specify the AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID at runtime.
I already tried using
hadoop -Dfs.s3a.access.key=${AWS_ACESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY} fs -ls s3a://my_bucket/
and
export HADOOP_CLIENT_OPTS="-Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY}"
and
export HADOOP_OPTS="-Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY}"
In the last two examples, I tried to run with:
hadoop fs -ls s3a://my-bucket/
In all cases, I got:
-ls: Fatal internal error
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3521)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
What am I doing wrong?
Upvotes: 1
Views: 6774
Reputation: 13430
I think part of the problem is that, confusingly, unlike the JVM -D opts, the Hadoop -D option expects a space between the -D and the key, e.g.:
hadoop fs -ls -D fs.s3a.access.key=AAIIED s3a://landsat-pds/
I would still avoid doing that on the command line though, as anyone who can do a ps command can see your secrets.
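For example, while the command above is running, any other user on the same host could do something like this (purely illustrative) and read the key straight out of the process arguments:
ps -ef | grep fs.s3a.secret.key   # the full command line, including -D values, is visible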
Generally we stick them into core-site.xml when running outside EC2; in EC2 it's handled magically.
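For reference, a minimal core-site.xml sketch could look like this (property names as used above; the values are placeholders to substitute with your own):
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <!-- placeholder value -->
    <value>YOUR_AWS_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <!-- placeholder value -->
    <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
  </property>
</configuration>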
Upvotes: 0
Reputation: 18270
This is a correct way to pass the credentials at runtime:
hadoop fs -Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY} -ls s3a://my_bucket/
Your syntax needs a small fix: the -D options must come after the fs subcommand, as shown above. Also make sure that empty strings are not passed as the values of these properties; empty values make the runtime properties invalid, and the client then falls back to searching for credentials along the authentication chain.
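As a quick sanity check (plain shell, assuming a bash-like shell), you can make the command fail fast if either variable is unset or empty before it ever reaches Hadoop:
# abort with an error message if either variable is unset or empty
: "${AWS_ACCESS_KEY_ID:?AWS_ACCESS_KEY_ID is empty}"
: "${AWS_SECRET_ACCESS_KEY:?AWS_SECRET_ACCESS_KEY is empty}"
hadoop fs -Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY} -ls s3a://my_bucket/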
The S3A client follows this authentication chain:
- If login details were provided in the filesystem URI, a warning is printed and then the username and password are extracted as the AWS key and secret respectively.
- The fs.s3a.access.key and fs.s3a.secret.key properties are looked for in the Hadoop XML configuration.
- The AWS environment variables are then looked for.
- An attempt is made to query the Amazon EC2 Instance Metadata Service to retrieve credentials published to EC2 VMs.
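As an aside, you can see what that last step returns by querying the metadata service from inside an EC2 instance that has an IAM role attached (IMDSv1-style requests shown; the role name is whatever your instance actually uses):
# list the IAM role(s) visible to this instance
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
# fetch the temporary credentials for that role (substitute the name printed above)
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<ROLE_NAME>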
Other possible methods to pass the credentials at runtime (note that it is neither safe nor recommended to supply them this way):
1) Embed them in the S3 URI
hdfs dfs -ls s3a://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@my-bucket/
If the secret key contains any + or / symbols, escape them with %2B and %2F respectively (see the example below).
Never share the URL or any logs generated with it, and never use such an inline authentication mechanism in production.
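For example, with a made-up secret key abc+def/ghi (purely illustrative) and a placeholder access key, the escaped URI would be:
hdfs dfs -ls s3a://<YOUR_AWS_ACCESS_KEY_ID>:abc%2Bdef%2Fghi@my-bucket/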
2) Export environment variables for the session
export AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>
hdfs dfs -ls s3a://my-bucket/
Upvotes: 7