sks

Reputation: 169

How to replace multiple values in a specific column of a Spark DataFrame?

I am trying to replace some specific values in one column of a DataFrame. Since a DataFrame is immutable, I want to transform it into a new DataFrame rather than update it in place.

I tried dataframe.replace as described in the Spark docs, but it fails with: error: value replace is not a member of org.apache.spark.sql.DataFrame

I also tried the option below, passing the multiple values to replace as an array:

val new_df= df.replace("Stringcolumn", Map(array("11","17","18","10"->"12")))

but I get this error:

error: overloaded method value array with alternatives

Help is really appreciated!!

Upvotes: 0

Views: 1698

Answers (1)

Swadhin Shahriar

Reputation: 46

Functions from org.apache.spark.sql.DataFrameNaFunctions, such as replace, are accessed through .na on the DataFrame. So your code should look something like this:

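// A plain Scala Map is enough here (ImmutableMap is only needed from Java).
// Keys and values must match the column's type, so use strings for a string column.
df.na.replace("Stringcolumn", Map("11" -> "12", "17" -> "12", "18" -> "12", "10" -> "12"))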

See the DataFrameNaFunctions API documentation for the full list of these functions and how to use them.
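For the case in the question, where several values all map to one target, the replacement map can be built from the list instead of being typed out by hand. The following is a minimal sketch, not a definitive implementation. It assumes Spark 2.x or later (SparkSession; on 1.x you would use a SQLContext instead) and a string-typed column. The sample data and the object name are hypothetical. It also shows a when/isin expression as a possible alternative to na.replace.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, when}

object ReplaceColumnValues {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("replace-column-values")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; "Stringcolumn" is the column name from the question.
    val df = Seq("10", "11", "17", "18", "42").toDF("Stringcolumn")

    // Map every value that should be replaced onto the same target value.
    val replacements: Map[String, String] =
      Seq("11", "17", "18", "10").map(_ -> "12").toMap

    // Option 1: DataFrameNaFunctions.replace, reached through df.na.
    // This returns a new DataFrame; df itself is unchanged.
    val replaced = df.na.replace("Stringcolumn", replacements)

    // Option 2: a conditional column expression with the same intent.
    val viaWhen = df.withColumn(
      "Stringcolumn",
      when(col("Stringcolumn").isin("11", "17", "18", "10"), "12")
        .otherwise(col("Stringcolumn")))

    replaced.show()
    viaWhen.show()
    spark.stop()
  }
}
```

Both options leave values outside the list (here "42") untouched. The map-based form is convenient when different values need different replacements, while the when/isin form reads more directly when everything collapses to a single value.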

Upvotes: 1
