Jed Toner

Trouble Authenticating bigrquery in R Within Docker on GitHub Actions Using Workload Identity Federation

I'm working on a GitHub Actions workflow that uses Workload Identity Federation (WIF) to authenticate with Google Cloud. As part of the workflow, I can successfully authenticate inside the container using gcloud CLI commands, but I can't get R's bigrquery package to pick up the credentials. Below is the relevant portion of my workflow file:

      # Step 3: Authenticate to Google Cloud via Workload Identity Federation
      - name: Authenticate to Google Cloud
        uses: google-github-actions/[email protected]
        id: auth
        with:
          workload_identity_provider: ${{ secrets.WIF_PROVIDER }}
          service_account: ${{ secrets.WIF_SERVICE_ACCOUNT }}
          token_format: "access_token"
          create_credentials_file: true

      # Step 5: Docker login to Artifact Registry using the WIF token
      - name: Docker login to Artifact Registry
        run: |
          echo ${{ steps.auth.outputs.access_token }} | docker login -u oauth2accesstoken --password-stdin https://${{ env.location }}-docker.pkg.dev

      # Step 6: Pull the specified Docker image
      - name: Pull Docker image
        run: docker pull ${{ env.location }}-docker.pkg.dev/${{ env.project }}/${{ env.repository }}/base-image:latest

      - name: Debug for google auth in R
        run: |
          docker run --rm \
            -v ${{ steps.auth.outputs.credentials_file_path }}:/gcp/creds.json:ro \
            -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
            -e SUPABASE_API_URL="${{ secrets.SUPABASE_API_URL }}" \
            -e SUPABASE_SERVICE_KEY="${{ secrets.SUPABASE_SERVICE_KEY }}" \
            -e GCP_PROJECT=${{ env.project }} \
            ${{ env.location }}-docker.pkg.dev/${{ env.project }}/${{ env.repository }}/base-image:latest \
            sh -c '
              gcloud auth login --cred-file=/gcp/creds.json
              gcloud config set project $GCP_PROJECT
              bq ls --project_id=$GCP_PROJECT --format=prettyjson # works fine

              # The following R lines all fail to authenticate:
              # Rscript -e "library(bigrquery); bq_auth(use_oob = TRUE); print(bq_project_datasets(Sys.getenv(\"GCP_PROJECT\")))"
              # Rscript -e "library(bigrquery); bq_auth(path = \"/gcp/creds.json\"); print(bq_project_datasets(Sys.getenv(\"GCP_PROJECT\")))"
              # Rscript -e "library(bigrquery); library(gargle); gargle_token <- gargle::credentials_service_account(path = \"/gcp/creds.json\", scopes = \"https://www.googleapis.com/auth/cloud-platform\"); bq_auth(token = gargle_token); print(bq_project_datasets(Sys.getenv(\"GCP_PROJECT\")))"
              # Rscript -e "library(bigrquery); library(gargle); gargle_token <- gargle::credentials_app_default(scopes = \"https://www.googleapis.com/auth/cloud-platform\"); bq_auth(token = gargle_token); print(bq_project_datasets(Sys.getenv(\"GCP_PROJECT\")))"
              # Rscript -e "library(bigrquery); library(gargle); gargle_token <- gargle::credentials_external_account(path = Sys.getenv(\"GOOGLE_APPLICATION_CREDENTIALS\"), scopes = \"https://www.googleapis.com/auth/cloud-platform\"); bq_auth(token = gargle_token); print(bq_project_datasets(Sys.getenv(\"GCP_PROJECT\")))"
            '

What works

Workload Identity Federation: GitHub Actions successfully acquires an access token and creates a credentials JSON file.

gcloud CLI (within Docker):

gcloud auth login --cred-file=/gcp/creds.json
gcloud config set project $GCP_PROJECT
bq ls --project_id=$GCP_PROJECT --format=prettyjson

All of these commands work perfectly from within the container, demonstrating that authentication does succeed at the gcloud level.
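
One check that may help narrow this down is to print the top-level "type" field of the mounted credentials file. Google credential JSON files carry such a field (for example service_account, authorized_user, or external_account), and gcloud's --cred-file accepts more than one kind, so gcloud succeeding does not by itself show which kind R is being handed. The following is a minimal sketch, not part of the workflow above: the helper name cred_type is hypothetical, and taking the first "type" key found in the file is a heuristic that assumes the top-level field appears before any nested one.

```shell
# Hypothetical helper: print the value of the first "type" key in a
# Google credentials JSON file (heuristic: assumes the top-level
# "type" appears before any nested "type" key).
cred_type() {
  grep -o '"type"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" \
    | head -n 1 \
    | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/'
}

# Only inspect the file if the variable points at an existing file.
if [ -f "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
  echo "credential type: $(cred_type "$GOOGLE_APPLICATION_CREDENTIALS")"
fi
```

If this prints external_account rather than service_account, that would be relevant to calls such as credentials_service_account(), which expect a service account key file.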

What doesn't work

All attempts to authenticate within R using the bigrquery package (and underlying gargle package) fail. I've tried multiple approaches:

library(bigrquery)
bq_auth(use_oob = TRUE)

library(bigrquery)
bq_auth(use_oob = FALSE)

library(bigrquery)
bq_auth(path = "/gcp/creds.json")

library(bigrquery)
library(gargle)
gargle_token <- gargle::credentials_service_account(path = "/gcp/creds.json", 
    scopes = "https://www.googleapis.com/auth/cloud-platform")
bq_auth(token = gargle_token)

library(bigrquery)
library(gargle)
gargle_token <- gargle::credentials_external_account(path = Sys.getenv("GOOGLE_APPLICATION_CREDENTIALS"), 
    scopes = "https://www.googleapis.com/auth/cloud-platform")
bq_auth(token = gargle_token)

No matter which method I try, the subsequent call to bq_project_datasets() fails with an error indicating that it can’t authenticate or can’t find credentials. The most common error message I get is:

Error in `bq_auth()`:
! Can't get Google credentials.
ℹ Try calling `bq_auth()` directly with necessary specifics.

Question: Why does gcloud successfully pick up the credentials but bigrquery doesn’t? How should I configure my R environment or credentials setup so that bigrquery recognizes the token/credentials generated via Workload Identity Federation? Is there something special about WIF-based credentials that requires a different approach in bigrquery? Any help or insights on how to properly configure bigrquery (especially in a non-interactive Docker environment) would be greatly appreciated!

Please note that I have to use WIF authentication in GitHub Actions; there is no way around this.

Answers (0)