Reputation: 110
I'm trying to lowercase the first letter of column values.
I can't find a way to lower only the first letter using built-in functions. I know there's initcap
for capitalizing the data, but I'm trying to decapitalize.
I tried using substring, but it looks a bit overkill and didn't work:
val data = spark.sparkContext.parallelize(Seq(("Spark"),("SparkHello"),("Spark Hello"))).toDF("name")
data.withColumn("name",lower(substring($"name",1,1)) + substring($"name",2,?))
I know I can create a custom UDF but I thought there's may be a built-in solution for this.
Upvotes: 0
Views: 706
Reputation: 42352
You can use the Spark SQL substring
method, which allows omitting the length argument (it then takes the string all the way to the end):
data.withColumn("name", concat(lower(substring($"name",1,1)), expr("substring(name,2)"))).show
+-----------+
| name|
+-----------+
| spark|
| sparkHello|
|spark Hello|
+-----------+
Note that you cannot concatenate strings with +; you need to use concat.
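For comparison, if you did go the UDF route the question mentions, the underlying logic is tiny. A minimal sketch of that "decapitalize" function in plain Scala (the name `decap` is my own; wrapping it with `udf(...)` and applying it via `withColumn` would be the Spark side):

```scala
// Decapitalize: lowercase only the first character, keep the rest as-is.
// Handles the empty-string edge case that substring-based approaches can trip on.
def decap(s: String): String =
  if (s.isEmpty) s else s.head.toLower + s.tail

println(decap("SparkHello"))   // sparkHello
println(decap("Spark Hello"))  // spark Hello
```

The built-in `concat`/`substring` approach above is still preferable when it fits, since UDFs are opaque to Catalyst and can't be optimized.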
Upvotes: 1