Reputation: 2981
I am looking to replace all the values of a column in a Spark DataFrame with a particular value. I am using PySpark. I tried something like:

new_df = df.withColumn('column_name', 10)

Here I want to replace all the values in the column column_name with 10. In pandas this could be done with df['column_name'] = 10. I am unable to figure out how to do the same in Spark.
Upvotes: 7
Views: 10837
Reputation: 3129
It might be easier to use lit, which wraps a literal value in a Column expression, as follows:
from pyspark.sql.functions import lit
new_df = df.withColumn('column_name', lit(10))
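
For instance, a minimal runnable sketch (assuming a SparkSession named spark; the toy DataFrame and the column other are stand-ins for the question's data):

from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.getOrCreate()

# Stand-in for the question's df
df = spark.createDataFrame([(1, 'a'), (2, 'b')], ['column_name', 'other'])

# Overwrite every value in column_name with the literal 10
new_df = df.withColumn('column_name', lit(10))
new_df.show()  # column_name is 10 in every row

Because lit produces a native Column expression, this runs entirely inside Spark with no Python round-trips.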
Upvotes: 7
Reputation: 18022
You can also use a UDF to replace the value. Currying the UDF (closing over the replacement value) lets you reuse it with different values:
from pyspark.sql.functions import udf, col
from pyspark.sql.types import IntegerType

def replacerUDF(value):
    # Curried: returns a UDF that ignores its input and always yields `value`
    return udf(lambda x: value, IntegerType())

new_df = df.withColumn("column_name", replacerUDF(10)(col("column_name")))
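
As a quick usage sketch, the curried factory can be reused for different replacement values (other_col is a hypothetical second column, not from the question):

fill_ten = replacerUDF(10)
fill_zero = replacerUDF(0)

new_df = (df
          .withColumn('column_name', fill_ten(col('column_name')))
          .withColumn('other_col', fill_zero(col('other_col'))))  # other_col is hypothetical

Note that a Python UDF serializes each row out to a Python worker, so for a constant value the lit approach above is generally faster.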
Upvotes: 2