paulo

Reputation: 59

Create Column Based On Aggregation of Other Columns - PySpark

I want to create a column whose values are taken from another column when certain conditions are met. Specifically, for all rows that share the same gender, week and type, the new column first should hold the units value of the row with the lowest share. I have the following dataframe:

+------+----+----+-------------+-------------------+
|gender|week|type|        share|              units|
+------+----+----+-------------+-------------------+
|  Male|  37|Polo|         0.01|             1809.0|
|  Male|  37|Polo|          0.1|             2327.0|
|  Male|  37|Polo|         0.15|             2982.0|
|  Male|  37|Polo|          0.2|             3558.0|
|  Male|  38|Polo|         0.01|             1700.0|
|  Male|  38|Polo|          0.1|             2245.0|
|  Male|  38|Polo|         0.15|             2900.0|
|  Male|  38|Polo|          0.2|             3477.0|
+------+----+----+-------------+-------------------+

I want the output to be:

+------+----+----+-------------+-------------------+---------+
|gender|week|type|        share|              units|    first|
+------+----+----+-------------+-------------------+---------+
|  Male|  37|Polo|         0.01|             1809.0|   1809.0|
|  Male|  37|Polo|          0.1|             2327.0|   1809.0|
|  Male|  37|Polo|         0.15|             2982.0|   1809.0|
|  Male|  37|Polo|          0.2|             3558.0|   1809.0|
|  Male|  38|Polo|         0.01|             1700.0|   1700.0|
|  Male|  38|Polo|          0.1|             2245.0|   1700.0|
|  Male|  38|Polo|         0.15|             2900.0|   1700.0|
|  Male|  38|Polo|          0.2|             3477.0|   1700.0|
+------+----+----+-------------+-------------------+---------+

How can I implement this?

Upvotes: 0

Views: 240

Answers (1)

paulo

Reputation: 59

I found the answer, so I am posting it here. I used a window function that partitions the rows by gender, week and type and orders each partition by share:

from pyspark.sql import Window
m_window = Window.partitionBy(["gender","week","type"]).orderBy("share")

Then I created the new column by applying the function first over that window, which picks the units value of the lowest-share row in each partition:

from pyspark.sql.functions import first
df = df.withColumn("first", first("units").over(m_window))
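
To check what the window computes without starting a Spark session, here is a minimal plain-Python sketch of the same logic: group by gender, week and type, sort each group by share, and copy the first units value onto every row of the group. The helper add_first and its parameter names are made up for illustration and are not part of PySpark; the rows are the ones from the question.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows copied from the question's dataframe.
rows = [
    {"gender": "Male", "week": 37, "type": "Polo", "share": 0.01, "units": 1809.0},
    {"gender": "Male", "week": 37, "type": "Polo", "share": 0.1, "units": 2327.0},
    {"gender": "Male", "week": 37, "type": "Polo", "share": 0.15, "units": 2982.0},
    {"gender": "Male", "week": 37, "type": "Polo", "share": 0.2, "units": 3558.0},
    {"gender": "Male", "week": 38, "type": "Polo", "share": 0.01, "units": 1700.0},
    {"gender": "Male", "week": 38, "type": "Polo", "share": 0.1, "units": 2245.0},
    {"gender": "Male", "week": 38, "type": "Polo", "share": 0.15, "units": 2900.0},
    {"gender": "Male", "week": 38, "type": "Polo", "share": 0.2, "units": 3477.0},
]


def add_first(rows, keys=("gender", "week", "type"), order_by="share", value="units"):
    """Hypothetical helper that mimics first(value) over a window
    partitioned by `keys` and ordered by `order_by`."""
    key_fn = itemgetter(*keys)
    # Sort by partition key first, then by the ordering column inside each partition.
    ordered = sorted(rows, key=lambda r: (key_fn(r), r[order_by]))
    out = []
    for _, group in groupby(ordered, key=key_fn):
        group = list(group)
        # The first row of the sorted partition supplies the value for the whole group.
        first_value = group[0][value]
        out.extend({**r, "first": first_value} for r in group)
    return out


result = add_first(rows)
for r in result:
    print(r["week"], r["share"], r["units"], r["first"])
```

Every week 37 row ends up with first = 1809.0 and every week 38 row with first = 1700.0, which matches the desired output in the question.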

Upvotes: 1
