I have a dataframe as follows:
key     | value
inv_1_c | 5
inv_1_v | 8
inv_2_c | 9
I would like to add two columns to the dataframe, Voltage and Current. Voltage would be value if key ends with "_v", or 0 otherwise. Current would be value if key ends with "_c", or 0 otherwise. What would be the Scala Spark code for this?
Upvotes: 2
You can use the substring function to get the last two characters of key, check whether they equal "_v" or "_c", and add the two new columns with withColumn:
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for toDF and the $"..." column syntax

val data = Seq(
  ("inv_1_c", "5"),
  ("inv_1_v", "8"),
  ("inv_2_c", "9")
).toDF("key", "value")

data.withColumn("temp", substring($"key", -2, 2)) // last two characters of key
  .withColumn("voltage", when($"temp" === "_v", $"value").otherwise(0))
  .withColumn("current", when($"temp" === "_c", $"value").otherwise(0))
  .drop("temp") // the helper column is no longer needed
  .show(false)
Output:
+-------+-----+-------+-------+
|key |value|voltage|current|
+-------+-----+-------+-------+
|inv_1_c|5 |0 |5 |
|inv_1_v|8 |8 |0 |
|inv_2_c|9 |0 |9 |
+-------+-----+-------+-------+
Hope this helps!
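As a variation, you could skip the temporary column entirely by testing the suffix with Column's endsWith. A minimal sketch, assuming the same data DataFrame as above:

```scala
import org.apache.spark.sql.functions.when

// Same result without a helper column: test the key suffix directly.
val result = data
  .withColumn("voltage", when($"key".endsWith("_v"), $"value").otherwise(0))
  .withColumn("current", when($"key".endsWith("_c"), $"value").otherwise(0))

result.show(false)
```

This produces the same output table; which version you prefer is mostly a matter of readability.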
Upvotes: 3