I'm exploring Delta-RS to convert Pandas DataFrames into Delta tables and leverage its ACID compliance, particularly testing atomicity. Atomicity ensures that either an entire operation is completed, or no changes are made at all.
To test this, I wrote a function that intentionally raises an exception before calling write_deltalake. My expectation was that no changes would occur, and no files would be written if the exception was raised. However, this isn't what I observed, and I need clarification on why this behavior is happening.
Here’s the function I used:
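    import pandas as pd
    from deltalake import write_deltalake

    # STORAGE_OPTIONS (S3 credentials/config) is defined elsewhere
    def create_dim(df: pd.DataFrame, table_path: str):
        raise Exception("Raising Error")  # Intentionally raise an exception before writing
        write_deltalake(table_path, df, mode="overwrite", storage_options=STORAGE_OPTIONS)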
Expected Outcome:
Since the raise Exception statement comes before the write_deltalake call, I expected that:
- No folder would be created at the table_path in my S3 bucket.
- No Delta log or Parquet file would be written.
Actual Outcome:
When I executed the function (by calling create_dim(..., ...)), the following happened:
- A folder was created at the specified table_path.
- Both a Delta log and a Parquet file were written to the S3 bucket.
This behavior seems counterintuitive, as I expected Delta Lake's atomicity guarantee to ensure that no changes would occur unless the entire transaction succeeded.
Why are files being written even though the exception is raised before the write_deltalake function call? Am I misunderstanding Delta-RS's behavior or Python's execution model?
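To separate the Python question from the Delta-RS one, here is a minimal sketch of the same control flow with the writer swapped out. It assumes a hypothetical stand-in, fake_write, that only creates a local folder and a file; it is not Delta-RS and does not touch S3. The names create_dim_check and run_check are also made up for this illustration.

```python
import os
import tempfile


def fake_write(path: str) -> None:
    # Hypothetical stand-in for write_deltalake: creates a folder and one file.
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "part-0.txt"), "w") as f:
        f.write("data")


def create_dim_check(table_path: str) -> None:
    raise Exception("Raising Error")  # raised before the write, as in create_dim
    fake_write(table_path)  # never reached


def run_check() -> bool:
    """Return True if the target path exists after the failed call."""
    with tempfile.TemporaryDirectory() as root:
        table_path = os.path.join(root, "dim_table")
        try:
            create_dim_check(table_path)
        except Exception:
            pass  # the exception is expected; only the side effects matter here
        return os.path.exists(table_path)


print(run_check())  # False: nothing was created at table_path
```

In this sketch the statement after raise never runs, so nothing is written. If the same holds for create_dim, files at table_path would have to come from something other than this call, such as an earlier run; that is an assumption, not something verified here.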
Any guidance would be greatly appreciated!