Reputation: 162
I am trying to connect to an Azure SQL Database from inside Azure Machine Learning Studio. Based on https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py, it seems that the recommended pattern is to create a Datastore using the Datastore.register_azure_sql_database method as follows:
import os
from azureml.core import Workspace, Datastore
ws = Workspace.from_config() # asks for interactive authentication the first time
sql_datastore_name = "datastore_test_01" # any name should be fine
server_name = os.getenv("SQL_SERVERNAME" , "{SQL_SERVERNAME}") # Name of the Azure SQL server
database_name = os.getenv("SQL_DATABASENAME" , "{SQL_DATABASENAME}") # Name of the Azure SQL database
username = os.getenv("SQL_USER_NAME" , "{SQL_USER_NAME}") # The username of the database user.
password = os.getenv("SQL_USER_PASSWORD" , "{SQL_USER_PASSWORD}") # The password of the database user.
sql_datastore = Datastore.register_azure_sql_database(workspace = ws,
datastore_name = sql_datastore_name,
server_name = server_name,
database_name = database_name,
username = username,
password = password)
I am pretty sure I have set all parameters right, having copied them from the ADO.NET connection string at my SQL Database resource --> Settings --> Connection strings:
Server=tcp:{SQL_SERVERNAME}.database.windows.net,1433;Initial Catalog={SQL_DATABASENAME};Persist Security Info=False;User ID={SQL_USER_NAME};Password={SQL_USER_PASSWORD};MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;
However, I get the following error:
Registering datastore failed with a 400 error code and error message 'Azure SQL Database Error -2146232060: Please check the correctness of the datastore information.'
Am I missing something? E.g., a firewall rule? I have also tried adding the Azure ML compute resource's public IP address to the list of allowed IP addresses in my SQL Database resource, but still no success.
UPDATE: adding skip_validation = True
to Datastore.register_azure_sql_database
solves the issue. I can then query the data with
from azureml.core import Dataset
from azureml.data.datapath import DataPath
query = DataPath(sql_datastore, 'SELECT * FROM my_table')
tabular = Dataset.Tabular.from_sql_query(query, query_timeout = 10)
df = tabular.to_pandas_dataframe()
Upvotes: 1
Views: 1234
Reputation: 501
is the datastore behind vnet? where are you running the registration code above? On a compute instance behind the same vnet? here is the doc that describe what you need to do to connect to data behind vnet: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-enable-virtual-network#use-datastores-and-datasets
Upvotes: 2