Reputation: 133
I'm using Spark/Scala locally to transform JSON files into a dataframe.
My current dataframe has a 'gender' column with 'Male' and 'Female' values. I want to change 'Male' to 'M' and 'Female' to 'F' using Spark SQL.
So far I have:
val results = spark.sql("SELECT name, CASE WHEN gender = 'Male' THEN 'M' WHEN gender = 'Female' THEN 'F' ELSE 'Unknown' END FROM ocupation_table")
but it's creating a separate column and I want it to rename the values in the existing 'gender' column.
Upvotes: 1
Views: 73
Reputation: 2610
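You can use Spark's withColumn(...) method to achieve this. It replaces the named column if it already exists, so the values are overwritten in place instead of landing in a new column. Taking the first character assumes the column only contains 'Male' and 'Female'; any other value would also be cut down to its first letter. Something like this should do the trick: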
import org.apache.spark.sql.functions.substring
// substring positions are 1-based, so (1, 1) keeps only the first character
val results = df.withColumn("gender", substring(df("gender"), 1, 1))
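If you want to keep the 'Unknown' fallback from the question, a when/otherwise chain inside withColumn is probably closer to the original CASE expression. The following is a minimal sketch, not a definitive implementation: the object name GenderRecode, the recode helper, and the three sample rows are made up for illustration, and it assumes the column is called gender as in the question. It also shows the pure SQL route, where aliasing the CASE expression with AS gender should be enough to stop Spark from producing a separately named column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object GenderRecode {
  // Overwrites "gender" in place: 'Male' -> 'M', 'Female' -> 'F',
  // anything else (including null) -> 'Unknown'.
  def recode(df: DataFrame): DataFrame =
    df.withColumn(
      "gender",
      when(col("gender") === "Male", "M")
        .when(col("gender") === "Female", "F")
        .otherwise("Unknown")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("gender-recode")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows standing in for the JSON-derived dataframe.
    val df = Seq(("Ann", "Female"), ("Bob", "Male"), ("Sam", "Other"))
      .toDF("name", "gender")

    // DataFrame API: the existing column is replaced, not duplicated.
    recode(df).show()

    // Spark SQL: same CASE expression as in the question, aliased back to "gender".
    df.createOrReplaceTempView("ocupation_table")
    spark.sql(
      """SELECT name,
        |       CASE WHEN gender = 'Male' THEN 'M'
        |            WHEN gender = 'Female' THEN 'F'
        |            ELSE 'Unknown' END AS gender
        |FROM ocupation_table""".stripMargin
    ).show()

    spark.stop()
  }
}
```

Both variants leave the dataframe with just the name and gender columns. The SQL version only returns the columns listed in the SELECT, so any other columns you need have to be added there explicitly, whereas withColumn keeps every existing column.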
Upvotes: 0