adarsh kadameri

Reputation: 89

How can I convert the values of a DataFrame column to title case in Scala?

Input DataFrame:

val ds = Seq((1,"play framework"),
  (2,"spark framework"),
  (3,"spring framework ")).toDF("id","subject")

I expect the subject column to be in title case, as follows:

 val ds = Seq((1,"Play Framework"),
  (2,"Spark Framework"),
  (3,"Spring Framework ")).toDF("id","subject")

I can use the lower function from org.apache.spark.sql.functions, for example

ds.select($"subject", lower($"subject")).show

to convert the column to lower case. But how can I get the result shown above?

Upvotes: 1

Views: 10719

Answers (2)

Manoj Kumar Dhakad

Reputation: 1892

You can do it like this:

val capitalizeUDF = udf((str: String) => str.split(" ").map(word => word.trim.capitalize).mkString(" "))

ds.select($"id", capitalizeUDF($"subject").alias("subject")).show

or

ds.select($"id",initcap($"subject").alias("subject")).show

Sample output:

+---+----------------+
| id|         subject|
+---+----------------+
|  1|  Play Framework|
|  2| Spark Framework|
|  3|Spring Framework|
+---+----------------+
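
The split-and-capitalize approach can also be written as a plain Scala function that tolerates null values and repeated whitespace. This is only a minimal sketch: titleCase is a hypothetical helper name, not part of Spark, and it leaves letters after the first one in each word unchanged.

```scala
// Hypothetical helper (not part of Spark): title-cases a string by
// capitalizing the first letter of every whitespace-separated word.
// Null input is returned as null; extra whitespace is collapsed.
def titleCase(s: String): String =
  Option(s)
    .map(_.trim.split("\\s+").filter(_.nonEmpty).map(_.capitalize).mkString(" "))
    .orNull
```

Assuming the usual org.apache.spark.sql.functions.udf import, it could be wrapped as udf(titleCase _) and applied to $"subject" in the same way as the UDF above.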

Upvotes: 1

Ramesh Maharjan

Reputation: 41957

There is a built-in function called initcap that does exactly what you need:

import org.apache.spark.sql.functions._
ds.withColumn("subject", initcap(col("subject"))).show(false)

The official documentation says:

public static Column initcap(Column e)
Returns a new string column by converting the first letter of each word to uppercase. Words are delimited by whitespace.
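
Note that initcap also lowercases the remaining letters of each word and does not strip surrounding whitespace, so a value such as "spring framework " keeps its trailing space. If that matters, trim can be applied first. The following is a sketch under assumptions: the object name, the titleCaseSubject helper, the local SparkSession settings, and the mixed-case sample row are illustrative and not taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, initcap, trim}

object InitcapExample {
  // Hypothetical helper: strips leading/trailing spaces from "subject",
  // then title-cases it with the built-in initcap function.
  def titleCaseSubject(df: DataFrame): DataFrame =
    df.withColumn("subject", initcap(trim(col("subject"))))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("initcap-example")
      .getOrCreate()
    import spark.implicits._

    // Sample rows; the second one is mixed case to show that initcap
    // lowercases everything after the first letter of each word.
    val ds = Seq(
      (1, "play framework"),
      (2, "sPARK framework"),
      (3, "spring framework ")
    ).toDF("id", "subject")

    titleCaseSubject(ds).show(false)
    spark.stop()
  }
}
```

Because initcap and trim are built-in column functions, this avoids the serialization overhead of a UDF and handles null values without extra code.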

Upvotes: 7
