Reputation: 11
This is my data. The column expected sales computes the moving average of current row and the two rows before it. Then if the sales are 0 or below 0.5 * MA, it's classified as stockout. Here's my code:
rolling_window1 = Window.orderBy('formatted_timestamp').rowsBetween(-2, 0)
result_df = result_df.withColumn('3_avg_sales', avg('hour_sales').over(rolling_window1))
result_df = result_df.withColumn('stockout_0.5', when((col('hour_sales') == 0) | (col('hour_sales') < 0.5 * col('avg_sales')), lit(1)).otherwise(lit(0)))
Now I want to compute the MA with the same window size, but not with counting rows that were classified as stockouts. If one of the 2 previous rows were classified as stockout, then I want to "increase window size", so that I still have 2 observations (not stockouts).
Upvotes: 1
Views: 26