Reputation: 6616
The Spark SQL documentation describes a when function that returns a Column. The example given is reproduced below:
people.select(when(people("gender") === "male", 0)
.when(people("gender") === "female", 1)
.otherwise(2))
In this example, the result of the when condition is either a 0, 1, or 2. But what if I wanted the result to be a column of the people DataFrame? For example, given the following data:
id | name | gender | testosterone | estrogen
-----------------------------------------------
1 | Joe | male | 10 | 2
2 | Sue | female | 3 | 12
3 | John | male | 9 | 3
4 | Kim | female | 1 | 10
I want something like this:
SELECT
  name,
  CASE WHEN gender = "male" THEN testosterone
       WHEN gender = "female" THEN estrogen
  END AS hormone_level
FROM
  people
And the result would be:
name | hormone_level
-----------------------
Joe | 10
Sue | 12
John | 9
Kim | 10
Upvotes: 0
Views: 317
Reputation: 31
Just pass the columns themselves as the result values of when:
when(people("gender") === "female", people("estrogen"))
.when(people("gender") === "male", people("testosterone"))
// .otherwise(???) // add a default if required; without it, unmatched rows come back as null
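For completeness, here is a minimal end-to-end sketch of the expression above applied to the sample data from the question. It assumes Spark 2.x or later (for SparkSession) and a local master; the object name HormoneLevel and the helpers samplePeople and hormoneLevels are made up for illustration and are not part of any Spark API.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.when

// Hypothetical wrapper object, only here to make the sketch self-contained.
object HormoneLevel {

  // Builds the sample table from the question as a DataFrame.
  def samplePeople(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1, "Joe", "male", 10, 2),
      (2, "Sue", "female", 3, 12),
      (3, "John", "male", 9, 3),
      (4, "Kim", "female", 1, 10)
    ).toDF("id", "name", "gender", "testosterone", "estrogen")
  }

  // Equivalent of the CASE WHEN query: the value passed to when is a Column,
  // so each row takes its result from testosterone or estrogen.
  def hormoneLevels(people: DataFrame): DataFrame =
    people.select(
      people("name"),
      when(people("gender") === "male", people("testosterone"))
        .when(people("gender") === "female", people("estrogen"))
        .as("hormone_level") // no otherwise: any other gender yields null
    )

  def main(args: Array[String]): Unit = {
    // Assumption: running locally; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hormone-level-sketch")
      .getOrCreate()

    hormoneLevels(samplePeople(spark)).show()
    spark.stop()
  }
}
```

With the four sample rows this should give the name / hormone_level pairs listed in the question. The alias is applied with as so the output column matches the SQL version.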
Upvotes: 3