Reputation: 551
I have a DataFrame with two columns. Example:
Col1 | Col2
001 | This is the first string
002 | This is the second string.
I want to convert the DataFrame column Col2 into the following format:
Col1 | Col2
001 | ["This", "is", "the", "first", "string" ]
002 | ["This", "is", "the", "second", "string" ]
Is there a built-in function that can help me achieve this?
Upvotes: 2
Views: 2062
Reputation: 4069
Just use the split function:
import pyspark.sql.functions as f
df = df.withColumn('Col2', f.split('Col2', ' '))
Upvotes: 1