Mission

Reputation: 31

How to find which source tables a table is built from in PySpark

I have a core layer containing several tables, and I want to find out which source-layer tables each of them is built from. The core tables are created by joining some of the source-layer tables. I want to generate an Excel sheet programmatically that shows which source tables each core table is made from.

I am using PySpark on Databricks, and the code that creates the tables is written in notebooks.

Any help on how to approach this would be appreciated.

Upvotes: 1

Views: 355

Answers (1)

Alex Ott

Reputation: 87249

This is possible if you use Databricks Unity Catalog. It includes a feature called Data Lineage that tracks which tables and columns were used to create a specific table, as well as who consumes it. It also provides a Lineage API that can be used to export the lineage data.
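As a rough illustration of the export idea, here is a minimal sketch that asks the Lineage API for the upstream tables of one core table and flattens the result into rows for a spreadsheet. Several things here are assumptions rather than confirmed details: the endpoint path `/api/2.0/lineage-tracking/table-lineage`, passing the parameters as a query string, and the response shape (`upstreams` entries with a `tableInfo` holding `catalog_name`, `schema_name` and `name`). The function names and the workspace URL, token and table names are hypothetical placeholders. Check all of these against the current Databricks documentation before relying on them.

```python
import csv
import json
import urllib.parse
import urllib.request


def fetch_table_lineage(host, token, table_name):
    """Request lineage for one table.

    Assumption: the endpoint path and parameter names below match the
    Unity Catalog Lineage API of your workspace; verify them in the docs.
    """
    query = urllib.parse.urlencode(
        {"table_name": table_name, "include_entity_lineage": "true"}
    )
    url = f"{host}/api/2.0/lineage-tracking/table-lineage?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def upstream_rows(core_table, lineage):
    """Flatten a lineage response into (core_table, source_table) rows.

    Assumption: each upstream table appears under "upstreams" with a
    "tableInfo" dict; entries without one (e.g. notebook-only) are skipped.
    """
    rows = []
    for upstream in lineage.get("upstreams", []):
        info = upstream.get("tableInfo")
        if not info:
            continue
        full_name = ".".join(
            [info["catalog_name"], info["schema_name"], info["name"]]
        )
        rows.append({"core_table": core_table, "source_table": full_name})
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file that Excel can open directly."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["core_table", "source_table"])
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical usage (placeholder host, token and table names):
# rows = []
# for table in ["main.core.orders", "main.core.customers"]:
#     lineage = fetch_table_lineage("https://<workspace-url>", "<token>", table)
#     rows.extend(upstream_rows(table, lineage))
# write_csv(rows, "core_table_lineage.csv")
```

The sketch writes CSV so that it only needs the standard library. If a real `.xlsx` file is required, the same list of rows could be loaded into a pandas DataFrame and saved with `DataFrame.to_excel`, which needs an Excel writer engine such as `openpyxl` installed.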

Upvotes: 1
