Reputation: 21
I used Workload Identity Federation from AWS EC2 to GCP BigQuery by using the role assigned to the EC2 instance, and it worked fine.
However, when I use Workload Identity Federation from an AWS Fargate task to GCP BigQuery by using the Fargate task role, it does not work.
How should I set up Workload Identity Federation in this case?
I used the libraries below.
implementation(platform("com.google.cloud:libraries-bom:20.9.0"))
implementation("com.google.cloud:google-cloud-bigquery")
The stack trace contains the messages below:
com.google.cloud.bigquery.BigQueryException: Failed to retrieve AWS IAM role.
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115) ~[google-cloud-bigquery-1.137.1.jar!/:1.137.1]
…
at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
Caused by: java.io.IOException: Failed to retrieve AWS IAM role.
at com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:217) ~[google-auth-library-oauth2-http-0.26.0.jar!/:na]
…
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getDataset(HttpBigQueryRpc.java:126) ~[google-cloud-bigquery-1.137.1.jar!/:1.137.1]
... 113 common frames omitted
Caused by: java.net.ConnectException: Invalid argument (connect failed)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:na]
at com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:214) ~[google-auth-library-oauth2-http-0.26.0.jar!/:na]
... 132 common frames omitted
Upvotes: 2
Views: 2080
Reputation: 51
I faced a similar issue with Google Cloud Storage (GCS).
As Peter mentioned, retrieving the credentials on an AWS Fargate task is not the same as on an EC2 instance, so the Google SDK fails to compose the correct AWS credentials to exchange with Google Workload Identity Federation.
I came up with a workaround that avoids the trouble of editing core files in "../google/auth/aws.py" by doing two things:
import os

import boto3
from google.auth.aws import environment_vars

# Get the temporary credentials provided by the ECS task role and expose
# them through the environment variables that google-auth checks.
task_credentials = boto3.Session().get_credentials().get_frozen_credentials()
os.environ[environment_vars.AWS_ACCESS_KEY_ID] = task_credentials.access_key
os.environ[environment_vars.AWS_SECRET_ACCESS_KEY] = task_credentials.secret_key
os.environ[environment_vars.AWS_SESSION_TOKEN] = task_credentials.token
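With those environment variables set, the Google client can be created as usual; google-auth reads the AWS credentials from the environment when exchanging them for a federated token. A minimal usage sketch, assuming GOOGLE_APPLICATION_CREDENTIALS already points at the Workload Identity Federation credential configuration file (the project ID below is a placeholder):
from google.cloud import storage  # requires google-cloud-storage

# Application Default Credentials loads the external-account (WIF) config
# and, in turn, picks up the AWS credentials exported above.
client = storage.Client(project="my-gcp-project")  # placeholder project ID
print([bucket.name for bucket in client.list_buckets()])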
Explanation:
I am using Python 3.9 with boto3 and google-cloud==2.4.0; however, it should work for other versions of the Google SDK as long as the following code is in the function "_get_security_credentials" under the class "Credentials" in the "google.auth.aws" package:
# Check environment variables for permanent credentials first.
# https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html
env_aws_access_key_id = os.environ.get(environment_vars.AWS_ACCESS_KEY_ID)
env_aws_secret_access_key = os.environ.get(
    environment_vars.AWS_SECRET_ACCESS_KEY
)
# This is normally not available for permanent credentials.
env_aws_session_token = os.environ.get(environment_vars.AWS_SESSION_TOKEN)

if env_aws_access_key_id and env_aws_secret_access_key:
    return {
        "access_key_id": env_aws_access_key_id,
        "secret_access_key": env_aws_secret_access_key,
        "security_token": env_aws_session_token,
    }
Caveat:
When running code inside an ECS task, the credentials being used are temporary (ECS assumes the task's role), therefore you can't generate temporary credentials via AWS STS as is usually recommended.
Why is this a problem? Since the task is running with temporary credentials, they are subject to expiry and refresh. To solve that, you can set up a background function that repeats the operation above every 5 minutes or so (I haven't run into a case where the temporary credentials expired); see the sketch below.
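A minimal sketch of such a background refresh, assuming the same boto3/google-auth setup as above (the function name refresh_aws_env_credentials and the 5-minute interval are my own choices):
import os
import threading

import boto3
from google.auth.aws import environment_vars

def refresh_aws_env_credentials(interval_seconds=300.0):
    # Re-read the task role credentials and re-export them as env vars.
    creds = boto3.Session().get_credentials().get_frozen_credentials()
    os.environ[environment_vars.AWS_ACCESS_KEY_ID] = creds.access_key
    os.environ[environment_vars.AWS_SECRET_ACCESS_KEY] = creds.secret_key
    os.environ[environment_vars.AWS_SESSION_TOKEN] = creds.token
    # Schedule the next refresh; a daemon timer won't block process shutdown.
    timer = threading.Timer(interval_seconds, refresh_aws_env_credentials,
                            args=(interval_seconds,))
    timer.daemon = True
    timer.start()

refresh_aws_env_credentials()  # start the periodic refresh (every 5 minutes)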
Upvotes: 3
Reputation: 1
I had the same issue but with Python code; either way, I think the cause is the same. You're getting this because retrieving the AWS IAM role on AWS Fargate is different from EC2. On EC2 you can get the credentials from the instance metadata endpoint, as shown here:
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access
While on AWS Fargate:
curl 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
So to get around that, the following needs to be done (see the sketch after these steps):
1. Point the credential source at the Fargate container credentials endpoint:
wif_cred_file["credential_source"]["url"] = f"http://169.254.170.2{AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}"
2. In the google.auth.aws package (aws.py), comment out the line
# role_name = self._get_metadata_role_name(request)
and remove role_name from the _get_metadata_security_credentials function args. Or, if you like, you may make the change from step 1 in the aws.py file instead; both ways should be fine.
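A minimal sketch of step 1, assuming the credential configuration JSON lives at a path of your choosing (CONFIG_PATH below is a hypothetical placeholder); depending on your google-auth version, the aws.py change from step 2 may still be needed:
import json
import os

# Hypothetical path to the credential configuration generated by
# `gcloud iam workload-identity-pools create-cred-config`.
CONFIG_PATH = "/app/gcp-wif-config.json"

with open(CONFIG_PATH) as f:
    wif_cred_file = json.load(f)

# Swap the EC2 instance-metadata URL for the Fargate container
# credentials endpoint.
relative_uri = os.environ["AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"]
wif_cred_file["credential_source"]["url"] = f"http://169.254.170.2{relative_uri}"

with open(CONFIG_PATH, "w") as f:
    json.dump(wif_cred_file, f)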
And that should be it.
Upvotes: 0