Nico Arbar

Reputation: 162

Is there a count function among the aggregate window functions in PySpark?

I'm trying to count the number of rows within a window and to have that value in a column for each window. To do that, I'm using the row_number function and then taking the max of that row_number column. My question is: is there a more effective way to achieve this goal, avoiding these two steps? Do the window functions have a count function?

Here's my code:

from pyspark.sql import functions as F

# `window` is a previously defined Window spec, e.g. Window.partitionBy(...)
output_df = input_df\
    .withColumn('row_number_window', F.row_number().over(window))\
    .withColumn('n_rows_count', F.max('row_number_window').over(window))

Upvotes: 1

Views: 454

Answers (1)

Lamanus

Reputation: 13581

Try this.

from pyspark.sql.functions import count, lit

count(lit(1)).over(window)
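For context, here is a minimal sketch of the whole pipeline in a single step; `group_col` is a hypothetical partition column, and `input_df` stands in for the question's DataFrame:

from pyspark.sql.functions import count, lit
from pyspark.sql.window import Window

# Hypothetical window spec for illustration; substitute your own partitioning.
window = Window.partitionBy('group_col')

# One step instead of two: count every row in each window partition.
output_df = input_df.withColumn('n_rows_count', count(lit(1)).over(window))

Note that count(lit(1)) counts all rows in the frame, nulls included. If the window spec has an orderBy, the default frame turns this into a running count per row rather than the partition total, so omit the ordering (or set an explicit unbounded frame) when you want the full count.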

Upvotes: 3
