Bea
Bea

Reputation: 11

Moving Average Conditioned by another columns' values

dataset

This is my data. The column expected sales computes the moving average of current row and the two rows before it. Then if the sales are 0 or below 0.5 * MA, it's classified as stockout. Here's my code:

rolling_window1 = Window.orderBy('formatted_timestamp').rowsBetween(-2, 0)

result_df = result_df.withColumn('3_avg_sales', avg('hour_sales').over(rolling_window1))

result_df = result_df.withColumn('stockout_0.5', when((col('hour_sales') == 0) | (col('hour_sales') < 0.5 * col('avg_sales')), lit(1)).otherwise(lit(0)))

Now I want to compute the MA with the same window size, but not with counting rows that were classified as stockouts. If one of the 2 previous rows were classified as stockout, then I want to "increase window size", so that I still have 2 observations (not stockouts).

Upvotes: 1

Views: 26

Answers (0)

Related Questions