Apache Iceberg table from Spark Explain Plan

Question

How can we check whether a query is accessing partitions correctly? Is there a way to run an explain plan for an Iceberg table?

Example: I have created an Iceberg table partitioned on month(tpep_pickup_datetime).

The query I'm running from Spark is:

df = spark.sql("select * from iceberg.nyc_yellowtaxi_tripdata_v2 where tpep_pickup_datetime = '2022-01-01 00:35:40'")

I just want to confirm that partitioning is working: which partition was accessed, or whether there was a full table scan. I have tried running df.explain(), but it does not show any partition information for the filters I added.

    Spark Running
== Physical Plan ==
*(1) Filter (isnotnull(tpep_pickup_datetime#217) AND (tpep_pickup_datetime#217 = 2022-01-01 00:35:40))
+- *(1) ColumnarToRow
   +- BatchScan[vendorid#216L, tpep_pickup_datetime#217, tpep_dropoff_datetime#218, passenger_count#219, trip_distance#220, ratecodeid#221, store_and_fwd_flag#222, pulocationid#223L, dolocationid#224L, payment_type#225L, fare_amount#226, extra#227, mta_tax#228, tip_amount#229, tolls_amount#230, improvement_surcharge#231, total_amount#232, congestion_surcharge#233, airport_fee#234] iceberg.nyc_yellowtaxi_tripdata_v2 [filters=tpep_pickup_datetime IS NOT NULL, tpep_pickup_datetime = 1640997340000000] RuntimeFilters: []
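For reading the plan above: the `filters=` list on the BatchScan node shows the predicates handed to the Iceberg scan, and the literal 1640997340000000 appears to be the timestamp expressed as microseconds since the Unix epoch. The sketch below is plain Python with no Spark session needed. It decodes that literal and computes the value the month transform would assign to it, which the Iceberg spec defines as months since 1970-01. It assumes the session time zone is UTC, and the helper name `iceberg_month_ordinal` is made up for illustration. It does not prove that pruning happened; it only shows which single month partition an equality filter on this value should map to.

```python
from datetime import datetime, timezone

def iceberg_month_ordinal(micros_since_epoch: int) -> int:
    """Months since 1970-01 for a microsecond timestamp.

    Hypothetical helper mirroring the definition of Iceberg's month
    transform; assumes the timestamp is interpreted in UTC.
    """
    ts = datetime.fromtimestamp(micros_since_epoch // 1_000_000, tz=timezone.utc)
    return (ts.year - 1970) * 12 + (ts.month - 1)

# Literal copied from the `filters=` section of the BatchScan node.
pushed_literal = 1640997340000000

# Decode the literal back into a readable timestamp (UTC assumed).
decoded = datetime.fromtimestamp(pushed_literal // 1_000_000, tz=timezone.utc)
decoded_text = decoded.strftime("%Y-%m-%d %H:%M:%S")

# Month partition value the filter value falls into.
month_ordinal = iceberg_month_ordinal(pushed_literal)

print(decoded_text, month_ordinal)
```

If the decoded timestamp matches the one in the WHERE clause, the filter reached the scan with the intended value, and only data files whose month partition equals the computed ordinal should need to be read.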
