Reputation: 2551
We're calling Azure Databricks notebooks from an Azure Data Factory pipeline that ingests data into Azure Synapse. But whenever a notebook run fails, it only shows the error message below:
com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
But when we open the run log and scroll down to this exception, just below it there is:
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: The column [4] is not nullable and also USE_DEFAULT_VALUE is false, thus empty input is not allowed. [ErrorCode = 107090] [SQLState = S0001]
or sometimes:
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: String or Binary would be truncated
The code we're using to ingest the data is:
try:
    data.write.format('com.databricks.spark.sqldw') \
        .option("url", connection_string) \
        .option("dbTable", table) \
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
        .option("tempDir", Connection.storageaccount_path + 'store/dataload') \
        .save(mode="append")
except Exception as e:
    raise Exception("error took place for table: " + table + " : " + str(e))
So this Underlying SQLException(s): section is the actual error message that tells us what went wrong, but it never appears in the runError output we see on the ADF pipeline. Because of that, it's impossible for us to identify the errors in bulk using Azure Log Analytics; we always have to manually scroll through the error log, one failure at a time.
We have thousands of runs happening in Production daily, and many pipelines fail regularly. Since we can't see the exact error messages, we're unable to monitor the failures efficiently.
Is there a way to make Databricks output the Underlying SQLException(s) instead of the generic message com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector?
Upvotes: 0
Views: 145
Reputation: 804
If you're using DBR >= 12.2 you can use the improved error handling described in the Error handling in Databricks documentation:
from pyspark.errors import PySparkException

try:
    spark.sql("SELECT * FROM does_not_exist").show()
except PySparkException as ex:
    print("Error Class : " + ex.getErrorClass())
    print("Message parameters: " + str(ex.getMessageParameters()))
    print("SQLSTATE : " + ex.getSqlState())
    print(ex)
Which yields:
Error Class : TABLE_OR_VIEW_NOT_FOUND
Message parameters: {'relationName': '`does_not_exist`'}
SQLSTATE : 42P01
[TABLE_OR_VIEW_NOT_FOUND] The table or view `does_not_exist` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [does_not_exist], [], false
Upvotes: 0
Reputation: 8402
To get the actual error message instead of the generic one, split the exception text on an appropriate delimiter and pick out the relevant parts by index.
try:
    df1.write.format('com.databricks.spark.sqldw') \
        .option("url", "URL") \
        .option("dbTable", "demo4") \
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
        .option("tempDir", "tempdir") \
        .option("forwardSparkAzureStorageCredentials", "true") \
        .save(mode="append")
except Exception as e:
    error_message = str(e)
    error_parts = error_message.split('\n')
    print("Error occurred:", error_parts[2], error_parts[3], error_parts[4])
Here I split the error message on \n (newline) into a list and, to get the expected result, printed the elements at indexes 2, 3, and 4.
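If the position of the underlying message shifts between failures, fixed indexes can break. A slightly more robust variant of the same idea is to match the SQLServerException lines with a regular expression. This is only a sketch; the function name and the sample text below are illustrative, not taken from an actual run:

```python
import re

def underlying_sql_errors(error_message: str) -> list:
    # Pull every embedded SQLServerException line out of the wrapper
    # message, regardless of where it appears in the text.
    return re.findall(
        r"com\.microsoft\.sqlserver\.jdbc\.SQLServerException:[^\n]*",
        error_message,
    )

# Illustrative sample shaped like the connector's wrapped error text.
sample = (
    "com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics "
    "failed to execute the JDBC query produced by the connector.\n"
    "Underlying SQLException(s):\n"
    "  - com.microsoft.sqlserver.jdbc.SQLServerException: "
    "HdfsBridge::recordReaderFillBuffer - Unexpected error encountered "
    "filling record reader buffer: HadoopExecutionException: "
    "String or Binary would be truncated"
)

print(underlying_sql_errors(sample))
```

The same call can be used inside the except block above, e.g. to build the message you re-raise so that ADF's runError carries the underlying SQLException text.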
My execution:
Upvotes: 0