Rao

Reputation: 153

Select an array of columns and an expr from a DataFrame (Spark, Scala)

Can I select a list of columns together with an expr from a DataFrame in a single select?

Below is the list of columns:

val dynamicColumnSelection = Array("a", "b", "c", "d", "e", "f")
// These columns will change dynamically.

I also have an expr to select from the same DataFrame along with the above columns:

expr("stack(3, 'g', g, 'h', h, 'i', i) as (Key,Value)") 

I am able to select either an array of columns on its own, or individually listed columns along with the expr:

df.select(col("a"), col("b"), col("c"), col("d"), col("e"),
          expr("stack(3, 'g', g, 'h', h, 'i', i) as (Key,Value)") )

But here the columns in dynamicColumnSelection are determined dynamically, so I cannot list them by hand.

How can I select the whole list of columns together with the expr? Please guide.

The DataFrame is huge, so I am not looking for a join-based solution.

Upvotes: 0

Views: 2175

Answers (1)

Oli

Reputation: 10406

You can map your Array of column names to an Array of Column objects, append the expression to it, and use `: _*` to "splat" the resulting array into the varargs parameter of select.

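import org.apache.spark.sql.functions.{col, expr}
import spark.implicits._ // for toDF; assumes a SparkSession named spark, as in spark-shell

// a one-row DataFrame, just to check that the approach works
val df = Seq((1, 2, 3, 4, 5, 6, 7, 8, 9))
    .toDF("a", "b", "c", "d", "e", "f", "g", "h", "i")
val e = expr("stack(3, 'g', g, 'h', h, 'i', i) as (Key,Value)")
val dynamicColumnSelection = Array("a", "b", "c", "d", "e", "f")
// turn each name into a Column, append the expr, then expand the array as varargs
val result = df.select(dynamicColumnSelection.map(col) :+ e: _*)
result.show()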

Which yields

+---+---+---+---+---+---+---+-----+
|  a|  b|  c|  d|  e|  f|Key|Value|
+---+---+---+---+---+---+---+-----+
|  1|  2|  3|  4|  5|  6|  g|    7|
|  1|  2|  3|  4|  5|  6|  h|    8|
|  1|  2|  3|  4|  5|  6|  i|    9|
+---+---+---+---+---+---+---+-----+
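
If the columns to unpivot also change dynamically, the stack expression itself can be built from a list of names and passed to selectExpr, which takes SQL expression strings as varargs. The following is only a minimal sketch: StackSelect, stackExpr and selectArgs are hypothetical names that do not appear in the question, and it assumes the column names are plain identifiers that need no backtick quoting.

```scala
// Hypothetical helper (not part of the original answer): builds the string
// arguments for DataFrame.selectExpr from plain column names.
object StackSelect {
  // Builds "stack(n, 'c1', c1, 'c2', c2, ...) as (Key, Value)".
  // Assumes the names are simple identifiers (no spaces or special characters).
  def stackExpr(unpivot: Seq[String]): String = {
    val pairs = unpivot.map(c => s"'$c', $c").mkString(", ")
    s"stack(${unpivot.size}, $pairs) as (Key, Value)"
  }

  // The pass-through column names followed by the stack expression.
  def selectArgs(keep: Seq[String], unpivot: Seq[String]): Seq[String] =
    keep :+ stackExpr(unpivot)
}
```

With the df from the example above, df.selectExpr(StackSelect.selectArgs(dynamicColumnSelection.toSeq, Seq("g", "h", "i")): _*) should produce the same three rows, since selectExpr parses each string the same way expr does.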

Upvotes: 3
