Reputation: 162
I'm trying to count the number of rows within a window and store that value in a column for each window. To do that, I'm currently using the row_number function and then taking the max of that row_number column. My question is: is there a more efficient way to achieve this that avoids these two steps? Do window functions have a count function?
Here's my code:
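# Note: this import shadows Python's built-in max
from pyspark.sql.functions import row_number, max

output_df = input_df\
    .withColumn('row_number_window', row_number().over(window))\
    .withColumn('n_rows_count', max('row_number_window').over(window))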
Upvotes: 1
Views: 454