user3579222

Reputation: 1430

Time-travel in a Managed Table with PySpark

In the Databricks documentation it is stated:

all tables created in Databricks are Delta tables, by default.

I create a table with

df.write.saveAsTable("table_name")

With the SQL API I can time-travel:

%sql
SELECT * FROM table_name VERSION AS OF 0

How can I time-travel with Python? I'm looking for something like

spark.table("mytab2").versionAsOf(3)

Upvotes: 1

Views: 326

Answers (2)

Powers

Reputation: 19328

This syntax also works:

spark.read.format("delta").option("versionAsOf", "0").table("mytab2")
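
For reference, the Delta reader also accepts a timestampAsOf option in the same position. A minimal sketch, assuming the mytab2 table exists; the date string is a placeholder, not a value from the question:

# Read the table as of a point in time instead of a version number
# (the timestamp below is a placeholder)
df = (
    spark.read.format("delta")
    .option("timestampAsOf", "2022-10-12")
    .table("mytab2")
)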

Upvotes: 1

Kombajn zbożowy

Reputation: 10703

Simplest way:

spark.table("mytab2@v3")  # as of version

or

spark.table("mytab2@20221012093243000")  # as of timestamp

Reference: Table batch reads and writes / @ syntax. The same page also documents a DataFrameReader API option, although for that you need to provide an explicit DBFS path to the Delta table, so it's a bit less convenient; a sketch follows.
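
A minimal sketch of that DataFrameReader variant, assuming a hypothetical warehouse path for mytab2 (the actual location can be looked up with DESCRIBE DETAIL mytab2):

# Hypothetical DBFS location of the managed table
path = "dbfs:/user/hive/warehouse/mytab2"

df = (
    spark.read.format("delta")
    .option("versionAsOf", 3)  # or .option("timestampAsOf", "2022-10-12")
    .load(path)                # explicit path instead of a table name
)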

Upvotes: 2
