Reputation: 20435
Given a DataFrame, for instance
val df = sc.parallelize(Seq((1L, 0.1), (2L, 0.2), (3L, 0.3))).toDF("k","v")
df.show
+---+---+
|  k|  v|
+---+---+
|  1|0.1|
|  2|0.2|
|  3|0.3|
+---+---+
How can the values in each row be summed into a new column named totals,
so that dfTotals.show prints the following?
+---+---+--------+
|  k|  v|  totals|
+---+---+--------+
|  1|0.1|     1.1|
|  2|0.2|     2.2|
|  3|0.3|     3.3|
+---+---+--------+
Upvotes: 1
Views: 552
Reputation: 20435
Found a solution that is simpler than I originally thought:
// $"k" needs the SQL implicits in scope (spark-shell imports them automatically)
val totals = $"k" + $"v"
val dfTotals = df.withColumn("totals", totals)
and so
dfTotals.show
+---+---+------+
|  k|  v|totals|
+---+---+------+
|  1|0.1|   1.1|
|  2|0.2|   2.2|
|  3|0.3|   3.3|
+---+---+------+
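Update: another approach, though not as neat (the alias is needed, otherwise the new column is named after the expression rather than totals):
df.select(df("k"), df("v"), (df("k") + df("v")).as("totals"))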
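If there are more than two columns, writing out every term gets tedious. A minimal sketch of one way to generalize, assuming every column is numeric and that a local SparkSession is acceptable for trying it out (the session setup and the app name "row-totals" are my own additions, not part of the original code; in spark-shell the session already exists):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a throwaway local session, only so the snippet runs on its own.
val spark = SparkSession.builder().master("local[*]").appName("row-totals").getOrCreate()
import spark.implicits._

// Same data as in the question.
val df = Seq((1L, 0.1), (2L, 0.2), (3L, 0.3)).toDF("k", "v")

// Turn each column name into a Column and add them all into one expression.
val totals = df.columns.map(col).reduce(_ + _)

// Attach the summed expression as a new column named "totals".
val dfTotals = df.withColumn("totals", totals)
dfTotals.show()
```

Note that a null in any of the summed columns makes the total for that row null, so nullable columns would probably need something like coalesce first.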
Upvotes: 1