Chique_Code

Reputation: 1530

Convert column to lowercase with PySpark

I want to convert all the values in the column "Channel" to lowercase. I have a DataFrame df that I created with PySpark in a Jupyter notebook. I tried the code from here but got an error, so this is not a duplicate.

My data looks like this:

id     Channel     Brand
123    Hair        Fashion
124    Nails       Fashion 

And I want it to be the following:

id     Channel     Brand
123    hair        Fashion
124    nails       Fashion 

I have tried the following:

new_df = df.select(lower(df.Channel)).alias('Channel')

This converts the values to lowercase, but I lose my other columns.

Upvotes: 0

Views: 4155

Answers (1)

YOLO

Reputation: 21719

You can simply use withColumn:

from pyspark.sql.functions import lower
new_df = df.withColumn('Channel', lower(df.Channel))

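Because a column named Channel already exists, withColumn replaces it in place and keeps all the other columns. Your select returned only the one expression you passed to it, which is why the other columns were lost.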

Upvotes: 4
