Micah Pearce
Micah Pearce

Reputation: 1935

Spark: how to make value of new column based on different columns

Spark 2.2.1 Pyspark

df = sqlContext.createDataFrame([
    ("dog", "1", "2", "3"),
    ("cat", "4", "5", "6"),
    ("dog", "7", "8", "9"),
    ("cat", "10", "11", "12"),
    ("dog", "13", "14", "15"),
    ("parrot", "16", "17", "18"),
    ("goldfish", "19", "20", "21"),
], ["pet", "dog_30", "cat_30", "parrot_30"])

And then I have list of the fields that I care above from the "Pet" column

dfvalues = ["dog", "cat", "parrot"]

I want to write code taht will give me the value from dog_30, cat_30 or parrot_30 that corresponds to the value in "pet". For example, in the first row the value for the pet column is dog and so we take the value for dog_30 which is 1.

I tried using this to get the code, but it just gives me nulls for the column stats. I also haven't figured out how to handle the goldfish case. I want to set that to 0.

mycols = [F.when(F.col("pet") == p + "_30", p) for p in dfvalues]
df = df.withColumn("newCol2",F.coalesce(*stats) )
df.show()

Desired output:

+--------+------+------+---------+------+
|     pet|dog_30|cat_30|parrot_30|stats |
+--------+------+------+---------+------+
|     dog|     1|     2|        3|  1   |
|     cat|     4|     5|        6|  5   |
|     dog|     7|     8|        9|  7   |
|     cat|    10|    11|       12|  11  |
|     dog|    13|    14|       15|  13  |
|  parrot|    16|    17|       18|  18  |
|goldfish|    19|    20|       21|  0   |
+--------+------+------+---------+------+

Upvotes: 0

Views: 101

Answers (1)

akuiper
akuiper

Reputation: 214927

The logic is off; you need .when(F.col("pet") == p, F.col(p + '_30')):

mycols = [F.when(F.col("pet") == p, F.col(p + '_30')) for p in dfvalues]
df = df.withColumn("newCol2",F.coalesce(F.coalesce(*mycols),F.lit(0)))
df.show()
+--------+------+------+---------+-------+
|     pet|dog_30|cat_30|parrot_30|newCol2|
+--------+------+------+---------+-------+
|     dog|     1|     2|        3|      1|
|     cat|     4|     5|        6|      5|
|     dog|     7|     8|        9|      7|
|     cat|    10|    11|       12|     11|
|     dog|    13|    14|       15|     13|
|  parrot|    16|    17|       18|     18|
|goldfish|    19|    20|       21|      0|
+--------+------+------+---------+-------+

Upvotes: 3

Related Questions