sparkscala

Reputation: 71

Pass list of column values to spark dataframe as new column

I am trying to add a new column to a Spark DataFrame, as shown below:

val abc = List("a", "b", "c", "d")   // list of column names

I want to pass the columns named in this list to the DataFrame as a single new column, apply sha2 to that new column, and cast the result to varchar(64).

source = source.withColumn("newcolumn", sha2(col(abc), 256).cast('varchar(64)'))

It compiled, but at runtime I get this error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'abc' given input columns:

The expected output is a DataFrame with a column named newcolumn whose value is a varchar(64) holding the sha2 of the listed columns' values concatenated with || as the separator.

Please suggest.

Upvotes: 0

Views: 1306

Answers (1)

notNull

Reputation: 31490

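We can map each name in the list to a column, join the column values with concat_ws using || as the separator, and apply sha2() to the concatenated string. With 256 as the bit length, sha2() returns the digest as a 64-character hex string.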

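import org.apache.spark.sql.functions.{col, concat_ws, sha2}
// Outside spark-shell, also import spark.implicits._ so that toDF is available.

val abc = Seq("a", "b", "c", "d")   // names of the columns to hash
val df = Seq((1, 2, 3, 4)).toDF("a", "b", "c", "d")
// Turn each name into a Column, join the values with "||", then hash with SHA-256.
df.withColumn("newColumn", sha2(concat_ws("||", abc.map(c => col(c)): _*), 256)).show(false)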
//+---+---+---+---+----------------------------------------------------------------+
//|a  |b  |c  |d  |newColumn                                                       |
//+---+---+---+---+----------------------------------------------------------------+
//|1  |2  |3  |4  |20a5b7415fb63243c5dbacc9b30375de49636051bda91859e392d3c6785557c9|
//+---+---+---+---+----------------------------------------------------------------+
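
To see what that expression computes for a single row, here is a rough plain-Scala sketch that needs no Spark session. RowHash, concatWs and sha256Hex are hypothetical names made up for this illustration, not Spark APIs, and it assumes the column values have already been converted to strings. Nulls are modelled as None and dropped, because concat_ws skips null values rather than returning null.

import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper object for illustration only; not part of Spark.
object RowHash {
  // Join the non-null values with the separator, the way concat_ws skips nulls.
  def concatWs(sep: String, values: Seq[Option[String]]): String =
    values.flatten.mkString(sep)

  // SHA-256 of the UTF-8 bytes, rendered as lowercase hex (always 64 characters).
  def sha256Hex(s: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(s.getBytes(StandardCharsets.UTF_8))
      .map(b => "%02x".format(b))
      .mkString

  def main(args: Array[String]): Unit = {
    // Values of columns a, b, c, d from the example row, as strings.
    val joined = concatWs("||", Seq(Some("1"), Some("2"), Some("3"), Some("4")))
    val digest = sha256Hex(joined)
    println(s"$joined -> $digest (${digest.length} hex characters)")
  }
}

The printed digest can be compared with the newColumn value in the table above. One consequence of the null handling: rows that differ only in which column is null, for example ("x", null) and (null, "x"), concatenate to the same string and so get the same hash. If that matters, wrapping each column in coalesce with a placeholder value before concat_ws is one way around it.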

Upvotes: 1
