Peeter Bonomo

Reputation: 21

Moving data from Google Big Query to Azure Data Lake Store using Azure Data Factory

I have a scenario where I need to connect Azure Data Factory (v2) to Google BigQuery to move data into my Azure Data Lake, but I can't get it to work.

When I create a Linked Service, I choose BigQuery as the source and fill in all the BigQuery information (project name, type of connection, etc.), but when I click the Validate button I get a message like this:

UserError: ERROR [28000] [Microsoft][BigQuery] (80) Authentication failed: invalid_grant ERROR [28000] [Microsoft][BigQuery] (80) Authentication failed: invalid_grant'Type=,Message=ERROR [28000] [Microsoft][BigQuery] (80) Authentication failed: invalid_grant ERROR [28000] [Microsoft][BigQuery] (80) Authentication failed: invalid_grant,Source=,', activityId: 05ff5ce1-5262-4794-853c-c9e39b7f4b71

Any ideas? Has anyone tested this connector?

Thanks, Peeter Bonomo

Upvotes: 2

Views: 4908

Answers (3)

Gary

Reputation: 79

Unfortunately, this is the error you get for any access issue. It's also the error you get when your token has expired, which it always does after 60 minutes. That is the incredibly curious part: like so many sets of instructions on OAuth 2.0 authentication for ADF, most never mention that all this work is to get a token that expires in 60 minutes. Without some method of refreshing it every time you connect, this is worthless. At least the following link mentions that you get this error because the token has expired; it's the only blog post (or Microsoft documentation) that bothers to mention this huge issue: https://medium.com/@varunck/data-transfer-from-google-bigquery-to-azure-blob-storage-using-azure-data-factory-adf-80659c078f83
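For reference, the refresh exchange that a client performs with a valid refresh token looks roughly like this (a minimal Python sketch against Google's standard OAuth 2.0 token endpoint; all credential values are hypothetical placeholders):

import requests

# Exchange the long-lived refresh token for a fresh access token.
# All credential values below are hypothetical placeholders.
response = requests.post(
    "https://oauth2.googleapis.com/token",
    data={
        "grant_type": "refresh_token",
        "client_id": "<client-id>.apps.googleusercontent.com",
        "client_secret": "<client-secret>",
        "refresh_token": "<refresh-token>",
    },
)
response.raise_for_status()
print(response.json()["access_token"])  # short-lived; expires in ~60 minutes

As long as the refresh token itself stays valid, this call can be repeated to mint a new short-lived access token on every connection, which is why the linked service asks for a refresh token rather than an access token.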

Here's a different method, which is what I will be attempting next; it uses a service account and an IR: https://www.youtube.com/watch?v=oRqRt7ya_DM

Upvotes: 0

J. Peach

Reputation: 171

The documentation for the ADF connector to BigQuery explains what the parameters mean, but it doesn't give any guidance on how to obtain the values for those parameters.

I spent a couple of days on this and finally got it to work. I used "User Authentication" because I wanted to use a cloud-based IR. The "Client Id" and "Client Secret" can be obtained by creating new credentials in the GCP Console. But to get the "Refresh Token", you have to do the OAuth2 dance with Google and intercept the token.

I put together a PowerShell script to do this for you and wrote up a post on GitHub that walks you through the process of authenticating ADF v2 to Google BigQuery:

https://github.com/AnalyticJeremy/ADF_BigQuery
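In outline, the dance looks something like the sketch below (a minimal Python illustration using Google's standard OAuth 2.0 endpoints, not the PowerShell script from the repo above; the client ID, secret, and redirect URI are hypothetical placeholders):

import urllib.parse
import requests

# Step 1: send the user to Google's consent screen. access_type=offline and
# prompt=consent are what make Google return a refresh token.
params = {
    "client_id": "<client-id>.apps.googleusercontent.com",
    "redirect_uri": "http://localhost:8080",
    "response_type": "code",
    "scope": "https://www.googleapis.com/auth/bigquery",
    "access_type": "offline",
    "prompt": "consent",
}
print("Open: https://accounts.google.com/o/oauth2/v2/auth?"
      + urllib.parse.urlencode(params))

# Step 2: after consenting, copy the "code" parameter that Google appends to
# the redirect URI and exchange it for tokens.
code = input("Paste the authorization code: ")
response = requests.post(
    "https://oauth2.googleapis.com/token",
    data={
        "grant_type": "authorization_code",
        "code": code,
        "client_id": "<client-id>.apps.googleusercontent.com",
        "client_secret": "<client-secret>",
        "redirect_uri": "http://localhost:8080",
    },
)
response.raise_for_status()
print(response.json()["refresh_token"])  # this goes into the linked service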

Upvotes: 8

Eva Xiao

Reputation: 47

According to https://learn.microsoft.com/en-us/azure/data-factory/connector-google-bigquery, to connect to Google BigQuery via the cloud IR, you need to provide the following information:

{
    "name": "GoogleBigQueryLinkedService",
    "properties": {
        "type": "GoogleBigQuery",
        "typeProperties": {
            "project" : "<project ID>",
            "additionalProjects" : "<additional project IDs>",
            "requestGoogleDriveScope" : true,
            "authenticationType" : "UserAuthentication",
            "clientId": "<id of the application used to generate the refresh token>",
            "clientSecret": {
                "type": "SecureString",
                "value":"<secret of the application used to generate the refresh token>"
            },
            "refreshToken": {
                 "type": "SecureString",
                 "value": "<refresh token>"
            }
        }
    }
}

Note that the user who grants permission to the application should have access to the project you want to query.
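If you'd rather create this linked service programmatically than through the portal UI, a rough sketch with the azure-mgmt-datafactory Python SDK would look like this (assuming the azure-identity and azure-mgmt-datafactory packages; the subscription, resource group, and factory names are hypothetical placeholders):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    GoogleBigQueryLinkedService,
    LinkedServiceResource,
    SecureString,
)

# All names and IDs below are hypothetical placeholders.
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

bigquery_ls = GoogleBigQueryLinkedService(
    project="<project ID>",
    authentication_type="UserAuthentication",
    client_id="<id of the application used to generate the refresh token>",
    client_secret=SecureString(value="<secret of the application>"),
    refresh_token=SecureString(value="<refresh token>"),
)

client.linked_services.create_or_update(
    resource_group_name="<resource-group>",
    factory_name="<factory-name>",
    linked_service_name="GoogleBigQueryLinkedService",
    linked_service=LinkedServiceResource(properties=bigquery_ls),
)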

Thanks, Eva

Upvotes: -2
