Rocky1989

Reputation: 399

Another way of passing an orderBy list to a PySpark Window method

I have a concern about performing a window operation on a PySpark DataFrame. I want to get the latest records from the input table with the condition below, but I want to avoid the for loop:

from pyspark.sql import Window
from pyspark.sql.functions import col, rank

groupby_col = ["col('customer_id')"]
orderby_col = ["col('process_date').desc()", "col('load_date').desc()"]

# eval() turns each string into a real Column expression
window_spec = Window.partitionBy([eval(x) for x in groupby_col]).orderBy([eval(x) for x in orderby_col])

df = df.withColumn("rank", rank().over(window_spec))
df = df.filter(col('rank') == 1)

My concern is that I have to convert each string in orderby_col into a Column expression with eval() inside a for loop. Could you please let me know how to pass multiple columns to orderBy in descending order without using a for loop?
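For reference, the eval() calls above simply expand to this hand-written equivalent, using only the columns already named in the lists:

window_spec = Window.partitionBy(col('customer_id')).orderBy(col('process_date').desc(), col('load_date').desc())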

Upvotes: 0

Views: 462

Answers (1)

Marcin Szczepański

Reputation: 31

from pyspark.sql import Window
import pyspark.sql.functions as f

groupby_col = ["customer_id"]
orderby_col = ["process_date", "load_date"]

# f.desc() accepts a single column name, so apply it to each entry
# with map() and unpack the results into orderBy -- no eval() and
# no explicit for loop needed
window_spec = Window.partitionBy(*groupby_col).orderBy(*map(f.desc, orderby_col))

df = df.withColumn("rank", f.rank().over(window_spec))
df = df.filter(f.col('rank') == 1)
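A minimal end-to-end sketch, assuming an active SparkSession named spark and made-up sample rows, to show the deduplication this produces:

from pyspark.sql import SparkSession, Window
import pyspark.sql.functions as f

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data: two loads for customer 1, one for customer 2
df = spark.createDataFrame(
    [(1, "2021-01-01", "2021-01-02"),
     (1, "2021-01-03", "2021-01-04"),
     (2, "2021-01-01", "2021-01-01")],
    ["customer_id", "process_date", "load_date"],
)

window_spec = Window.partitionBy("customer_id").orderBy(*map(f.desc, ["process_date", "load_date"]))

# Keep only the newest record per customer
df.withColumn("rank", f.rank().over(window_spec)).filter(f.col("rank") == 1).show()
# Expected: one row per customer_id, holding the latest process_date/load_date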

Upvotes: 1
