Ashvani Jaiswal

Reputation: 110

Delta Lake transaction log: what is recorded when adding records in bulk and then deleting some of them?

Suppose I insert one record into an empty Delta table. One Parquet file is created, along with one transaction log file, transaction00.json, which will contain:

  transaction00.json -- {add- parquet1 file name}

If I then insert one more record, a new Parquet file is created and transaction01.json will contain:

{add -  parquet2 file name}

Now, if I delete the 2nd record from the Delta table, transaction02.json is created and contains:

{remove-   parquet2 file name}
{add- parquet3 file name}

What happens when I insert 20 records at once and then delete 5 of them? I know that only one Parquet file will be created for the 20 records, along with one transaction.json log file, but I am not sure about the delete operation.

Can you please explain what the transaction log will contain after I delete the 5 records?

Upvotes: 1

Views: 516

Answers (1)

Alex Ott

Reputation: 87299

When you delete data from a Delta table, it performs the following operations:

  1. Identifies the files that contain records matching your delete condition
  2. Rewrites the identified files, removing the matching records and generating new files

As a result, the transaction log will contain a remove operation for every file that contained records matching your condition, and an add operation for each newly generated file.
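To make this concrete for the 20-records case, here is a minimal sketch in plain Python that only simulates the bookkeeping. It is not the Delta Lake API: the `table` dict, the `insert` and `delete` helpers, and the `part-0000N.parquet` file names are all hypothetical, and the log actions are reduced to their `path` field (real log entries carry more metadata). It also assumes that a file whose records are all deleted gets only a remove action, since there is nothing left to rewrite.

```python
import itertools

# Hypothetical file-name generator, standing in for real Parquet file names.
_file_ids = itertools.count(1)


def new_file_name():
    return f"part-{next(_file_ids):05d}.parquet"


def insert(table, rows):
    """Write all rows into one new file and return the commit's log actions."""
    name = new_file_name()
    table[name] = list(rows)
    return [{"add": {"path": name}}]


def delete(table, predicate):
    """Rewrite every file holding a matching row; return the log actions."""
    actions = []
    for name in list(table):  # iterate over a copy, the table is mutated below
        rows = table[name]
        kept = [r for r in rows if not predicate(r)]
        if len(kept) == len(rows):
            continue  # step 1: no matching rows, file is left untouched
        # step 2: the old file is logically removed...
        del table[name]
        actions.append({"remove": {"path": name}})
        # ...and the surviving rows, if any, go into a newly written file
        if kept:
            new_name = new_file_name()
            table[new_name] = kept
            actions.append({"add": {"path": new_name}})
    return actions


table = {}  # file name -> rows stored in that file
commit0 = insert(table, range(1, 21))       # insert 20 records at once
commit1 = delete(table, lambda r: r <= 5)   # delete 5 of them

print(commit0)
print(commit1)
```

In this sketch the delete commit holds one remove for the original file and one add for a new file containing the remaining 15 records. If the 20 records had been spread over several files, only the files containing matching records would show up in the commit.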

Upvotes: 1
