Reputation: 327
I have an iterative transformation applied to a DataFrame. It used to take a long time, and after a lot of research online it appears the issue was the DAG growing exponentially. To fix this I came across a solution: break the lineage by converting the DataFrame to an RDD and back to a DataFrame after each transformation in the loop. This works wonders when applied to a normal table, but now I'm using DLT and I'm getting this error:
Queries with streaming sources must be executed with writeStream.start();
Is there any way to resolve this?
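For context, here is a minimal sketch of the lineage-breaking loop I mean (the table name and the transformation are illustrative placeholders):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# "source_table" and the transformation below are placeholders.
df = spark.read.table("source_table")

for i in range(20):
    df = df.withColumn("value", F.col("value") * 2)  # the iterative step
    # Break the lineage: round-tripping through the RDD truncates the
    # logical plan, so the DAG does not grow with each iteration.
    df = spark.createDataFrame(df.rdd, schema=df.schema)
```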
Upvotes: 0
Views: 86
Reputation: 3250
The error indicates that the foreachBatch operation is not supported in Delta Live Tables (DLT) for streaming queries:
Queries with streaming sources must be executed with writeStream.start();
To work around this limitation, you can take the following approach:
1. Instead of writing directly to the DLT target table within the foreachBatch operation, write the intermediate results to a temporary table.
2. After processing each micro-batch, store the results in this temporary table.
3. Finally, use a separate job or process to periodically merge the data from the temporary table into your target table.
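A rough sketch of this pattern as a regular Structured Streaming job outside the DLT pipeline; staging_table, target_table, the id join key, and the checkpoint path are all assumptions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def stage_batch(batch_df, batch_id):
    # Inside foreachBatch, batch_df is a plain (non-streaming) DataFrame,
    # so lineage-breaking tricks like batch_df.rdd are allowed here.
    batch_df.write.mode("append").saveAsTable("staging_table")

(spark.readStream
      .table("source_table")                # hypothetical streaming source
      .writeStream
      .foreachBatch(stage_batch)
      .option("checkpointLocation", "/tmp/checkpoints/stage")  # illustrative
      .start())

# Run separately (e.g. as a scheduled job): merge staged rows into the
# target table, assuming "id" uniquely identifies a row.
spark.sql("""
    MERGE INTO target_table AS t
    USING staging_table AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```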
Reference: DLT fails with Queries with streaming sources must be executed with writeStream.start();
Upvotes: 0