Reputation: 3782
I use Spark 2.2.0 and Scala 2.11. I want to calculate rank as sold divided by the maximum sold value within the same type (i.e. among the rows that have the same type as the current row). But I do not know how to take type into account when calculating max.
This is my current code. It calculates sold as the difference between the maximum and minimum stock for the given period of time. The value sold means how many products were sold in this period of time.
val sales = df.select($"product_pk", $"type", $"stock")
  .groupBy($"type", $"product_pk")
  .agg((max($"stock") - min($"stock")) as "sold")
val ranks = sales.withColumn("rank", $"sold" / max($"sold"))
Upvotes: 1
Views: 663
Reputation: 41957
Here's what you can do, if I understood your question correctly:
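import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.max
// Partition by type so that max is evaluated per type, not over the whole DataFrame
val windowSpec = Window.partitionBy("type")
// Divide each row's sold by the largest sold among the rows of the same type
val ranks = sales.withColumn("rank", $"sold" / max($"sold").over(windowSpec))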
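For completeness, here is a minimal end-to-end sketch of the same idea that you can run locally. The sample rows, the object name RankByType, the helper ranksByType and the local[*] master are my own assumptions for illustration, not something taken from your data or setup.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, max, min}

object RankByType {
  // Hypothetical helper: expects the columns product_pk, type and stock.
  def ranksByType(df: DataFrame): DataFrame = {
    // sold = max(stock) - min(stock) per (type, product_pk), as in the question
    val sales = df
      .groupBy(col("type"), col("product_pk"))
      .agg((max(col("stock")) - min(col("stock"))).as("sold"))

    // The window groups rows by type without collapsing them,
    // so every row can see the maximum sold of its own type.
    val byType = Window.partitionBy("type")
    sales.withColumn("rank", col("sold") / max(col("sold")).over(byType))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rank-by-type")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: (product_pk, type, stock)
    val df = Seq(
      (1, "A", 10), (1, "A", 4),
      (2, "A", 8), (2, "A", 5),
      (3, "B", 20), (3, "B", 18)
    ).toDF("product_pk", "type", "stock")

    ranksByType(df).orderBy("type", "product_pk").show()
    spark.stop()
  }
}
```

With this sample data, sold is 6 and 3 for the two products of type A and 2 for the single product of type B, so the ranks come out as 1.0, 0.5 and 1.0. The window is used instead of a second groupBy because an aggregation would reduce each type to a single row, whereas here each product row has to be kept.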
I hope the answer is helpful.
Upvotes: 2