user3579222

Reputation: 1430

Time-travel in a Managed Table with PySpark

In the Databricks documentation it is stated:

all tables created in Databricks are Delta tables, by default.

I create a table with

df.write.saveAsTable("table_name")

With the SQL API I can time-travel:

%sql
SELECT * FROM table_name VERSION AS OF 0

How can I time-travel with Python? I'm looking for something like

spark.table("mytab2").versionAsOf(3)

Upvotes: 1

Views: 326

Answers (2)

Powers

Reputation: 19328

This syntax also works:

spark.read.format("delta").option("versionAsOf", "0").table("mytab2")
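
For reference, the Delta reader also accepts a timestampAsOf option in the same position. A minimal sketch, assuming the mytab2 table exists; the date string is a placeholder, not a value from the question:

# Read the table as of a point in time instead of a version number
# (the timestamp below is a placeholder)
df = (
    spark.read.format("delta")
    .option("timestampAsOf", "2022-10-12")
    .table("mytab2")
)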

Upvotes: 1

Kombajn zbożowy

Reputation: 10703

Simplest way:

spark.table("mytab2@v3")  # as of version

or

spark.table("mytab2@20221012093243000")  # as of timestamp

Reference: Table batch reads and writes / @ syntax. The same page also documents a DataFrameReader API option, although for that you need to provide an explicit DBFS path to the Delta table, so it's a bit less convenient; a sketch follows.
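
A minimal sketch of that DataFrameReader variant, assuming a hypothetical warehouse path for mytab2 (the actual location can be looked up with DESCRIBE DETAIL mytab2):

# Hypothetical DBFS location of the managed table
path = "dbfs:/user/hive/warehouse/mytab2"

df = (
    spark.read.format("delta")
    .option("versionAsOf", 3)  # or .option("timestampAsOf", "2022-10-12")
    .load(path)                # explicit path instead of a table name
)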

Upvotes: 2
