Cherry

Reputation: 33628

How to apply the Spark regexp_replace function for multiple key-value pairs?

Let's say there is a Map of key-value string pairs, like this:

val pairs = Map(
  "x1" -> "a",
  "1" -> "xf",
  "80" -> "AB"
)

Is there a way to add a new column by invoking regexp_replace in a loop over those pairs, something like this:

df.withColumn("newColumn", pairs.mapSomeHow((k,v) => regexp_replace(col("originalColumn"), k, v)))

For example, newColumn should hold the value of originalColumn with every occurrence of "x1", "1", and "80" replaced by the corresponding value from the map.

How can I do that?

Upvotes: 0

Views: 224

Answers (1)

Vladimir Matveev

Reputation: 128131

Something like this:

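import org.apache.spark.sql.functions.regexp_replace

df.withColumn(
  "newColumn",
  // Start from the original column and wrap it in one regexp_replace per pair
  pairs.foldLeft(df("originalColumn")) {
    case (c, (k, v)) =>
      regexp_replace(c, k, v)
  }
)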

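This builds a single Column expression: it starts from df("originalColumn") and, for each pair in the map, applies regexp_replace to the result of the previous replacement. Note that each key is treated as a regular expression and that the replacements run in the map's iteration order.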
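To see what the fold does without a Spark session, here is a minimal sketch of the same pattern on a plain String, with String.replaceAll standing in for regexp_replace (both take Java regular expressions). The object FoldReplaceSketch and the helper applyPairs are hypothetical names used only for illustration, and the Seq replaces the Map as an assumption to make the order of application explicit, since "1" also occurs inside "x1".

```scala
// Plain-Scala sketch of the fold; no Spark required.
// FoldReplaceSketch and applyPairs are illustrative names, not part of any library.
object FoldReplaceSketch {
  // A Seq keeps the replacement order explicit (assumption: order matters here,
  // because the key "1" is a substring of the key "x1").
  val pairs: Seq[(String, String)] = Seq(
    "x1" -> "a",
    "1"  -> "xf",
    "80" -> "AB"
  )

  // Start from the input and feed each intermediate result into the next
  // replacement, the same way the Column is threaded through regexp_replace.
  def applyPairs(input: String, rules: Seq[(String, String)]): String =
    rules.foldLeft(input) { case (acc, (k, v)) => acc.replaceAll(k, v) }

  def main(args: Array[String]): Unit = {
    println(applyPairs("x1-1-80", pairs))         // a-xf-AB
    println(applyPairs("x1-1-80", pairs.reverse)) // xxf-xf-AB
  }
}
```

The second call shows why order can matter: once "1" has been replaced first, "x1" no longer appears in the string. If the keys are meant as literal text rather than patterns, quoting them (for example with java.util.regex.Pattern.quote) is one option to consider.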

Upvotes: 2
