Reactormonk

Reputation: 21730

What's the proper way to map over a single column in a DataFrame?

Usually I do something like this:

import org.apache.spark.sql.functions.udf
val fun = udf { x => ... }
df.withColumn("new", fun(df.col("old"))).drop("old").withColumnRenamed("new", "old")

Is there a shorter way?

Upvotes: 2

Views: 1321

Answers (1)

eliasah

Reputation: 40380

I usually do the following:

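import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.udf
val df: DataFrame = ???
val fun = udf { x => ... }
// Reusing the existing column name replaces that column in place.
df.withColumn("old", fun(df.col("old")))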

But this overwrites the old column, so you'll lose its original values. Be careful not to lose valuable data.
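
If you need the original values afterwards, one option is to copy them to a second column before overwriting. This is only a minimal sketch: the `addOne` UDF, the `old_backup` column name, and the local SparkSession are made up for illustration and stand in for whatever your real transformation and setup are.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.udf

object OverwriteColumnSketch {
  // Hypothetical transformation standing in for the elided UDF body.
  val addOne = udf { (x: Int) => x + 1 }

  // Copy the original values to "old_backup" (assumed name),
  // then overwrite "old" in place by reusing its name in withColumn.
  def mapKeepingBackup(df: DataFrame): DataFrame =
    df.withColumn("old_backup", df.col("old"))
      .withColumn("old", addOne(df.col("old")))

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("overwrite-column-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1, 2, 3).toDF("old")
    mapKeepingBackup(df).show()

    spark.stop()
  }
}
```

If you don't need the original values, drop the first `withColumn` and the single-call version is enough.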

PS: Of course, a column can be referenced in several ways in Spark, so I'll let you decide which one to use.
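
For reference, here is a rough sketch of the common ways to refer to the column. The `twice` UDF and the sample data are assumptions for illustration, not part of the question. In this simple case all four variants should produce the same result.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ColumnAccessSketch {
  // Hypothetical transformation used only to have something to apply.
  val twice = udf { (x: Int) => x * 2 }

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("column-access-sketch")
      .getOrCreate()
    import spark.implicits._ // needed for toDF and the $"..." syntax

    val df = Seq(1, 2, 3).toDF("old")

    // Column bound to this particular DataFrame.
    val viaColMethod = df.withColumn("old", twice(df.col("old")))
    // Shorthand for df.col via DataFrame.apply.
    val viaApply = df.withColumn("old", twice(df("old")))
    // Unbound column, resolved by name when the plan is analyzed.
    val viaFunctionsCol = df.withColumn("old", twice(col("old")))
    // String interpolator from spark.implicits._
    val viaDollar = df.withColumn("old", twice($"old"))

    Seq(viaColMethod, viaApply, viaFunctionsCol, viaDollar).foreach(_.show())

    spark.stop()
  }
}
```

The bound forms (`df.col("old")`, `df("old")`) help disambiguate after joins where two DataFrames share a column name; the unbound forms are shorter when there is no ambiguity.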

Upvotes: 4
