Delete redshift table from within databricks using pyspark

Question

I tried to connect to a redshift system table called stv_sessions and I can read the data into a dataframe.

This stv_sessions table is a redshift system table which has the process id's of all the queries that are currently running.

To delete a query from running we can do this.

select pg_terminate_backend(pid)

While this works for me if I directly connect to redshift (using aginity), it gives me insuffecient previlege issues when trying to run from databricks.

Simply put I dont know how to run the query from databricks notebook.

I have tried this so far,

kill_query = "select pg_terminate_backend('12345')" 

some_random_df_i_created.write.format("com.databricks.spark.redshift").option("url",redshift_url).option("dbtable","stv_sessions").option("tempdir", temp_dir_loc).option("forward_spark_s3_credentials", True).options("preactions", kill_query).mode("append").save()

Please let me know if the methodology i follow is correct.

Thank you

Delete redshift table from within databricks using pyspark

Answers (1)

Related Questions