Reputation: 141
I am looking for a solution to load data from Azure SQL DB into the Azure Synapse Spark data lake (not into a dedicated pool).
The requirements are:
Thanks in advance for your responses.
Upvotes: 0
Views: 1176
Reputation: 11329
You can do it in the following ways.
By connecting Azure SQL Database to a Synapse notebook via a JDBC connection:
First, go to the SQL database and copy the JDBC credentials from the connection strings blade.
In this approach, the source table should have a last_modified date column, which helps in fetching only the new data.
Now, in the Synapse notebook, use the following code.
# JDBC connection details copied from the Azure SQL Database connection strings
jdbcHostname = "rakeshserver.database.windows.net"
jdbcDatabase = "rakeshsqldatbase"
jdbcPort = "1433"
username = "rakeshadmin"
password = "< your password >"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
    "user": username,
    "password": password,
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Subquery pushed down to SQL Server; only rows in the date window are read
pushdown_query = "(select * from ok where last_modified >= '2022-10-01' and last_modified <= '2022-11-01') ok2"
df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)
display(df)
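Since the goal is to land the data in the Spark data lake rather than keep it only in a dataframe, you can then persist the result to the workspace's ADLS Gen2 storage. A minimal sketch, assuming a hypothetical storage account, container, and folder path of your own:
# Write the extracted rows to ADLS Gen2 as Parquet; the placeholders below
# are assumptions - replace them with your own storage account and container.
output_path = "abfss://<container>@<storageaccount>.dfs.core.windows.net/sqldb/ok"
df.write.mode("overwrite").parquet(output_path)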
In the Synapse notebook code, get the data from the table as shown above. Schedule this notebook to run every day by using a Notebook activity in a Synapse pipeline. To make the above dates dynamic, you can pass them from the Notebook activity.
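A minimal sketch of that parameterization, assuming start_date and end_date as hypothetical parameter names (mark the cell as a parameters cell in Synapse so the Notebook activity's base parameters can override the defaults):
# Parameters cell: defaults are used for interactive runs; a Notebook
# activity's base parameters (start_date, end_date) override them in pipelines.
start_date = "2022-10-01"
end_date = "2022-11-01"

# Build the pushdown query from the parameters instead of hardcoded dates.
pushdown_query = f"(select * from ok where last_modified >= '{start_date}' and last_modified <= '{end_date}') ok2"
df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)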
By using a Lookup activity on SQL and passing its JSON output to the Notebook activity as a parameter:
First, use a Lookup activity query (filtering on the last_modified date) to get the desired results as JSON, and then pass this output array as a parameter to the Synapse notebook using a Notebook activity. You can read it as a dataframe in the Synapse notebook code, as sketched below.
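A minimal sketch of the notebook side, assuming the Lookup output array arrives as a JSON string in a hypothetical parameter named lookup_json:
import json
from pyspark.sql import Row

# Parameters cell: the Notebook activity passes the Lookup activity's output
# array here as a JSON string (example value shown for illustration only).
lookup_json = '[{"id": 1, "last_modified": "2022-10-02"}, {"id": 2, "last_modified": "2022-10-03"}]'

# Parse the JSON array and convert each record into a Spark Row.
rows = [Row(**r) for r in json.loads(lookup_json)]
df = spark.createDataFrame(rows)
display(df)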
Upvotes: 0