itscarlayall

Reputation: 166

Scala Spark - Select columns by name and list

I'm trying to select columns from a Scala Spark DataFrame using both single column names and names taken from a List. My current solution looks like:

var cols_list = List("d", "e")

df
.select(
    col("a"),
    col("b"),
    col("c"),
    cols_list.map(col): _*)

However, it throws an error:

<console>:81: error: no `: _*' annotation allowed here
(such annotations are only allowed in arguments to *-parameters)
               cols_list.map(col): _*
                                        ^

Any help would be appreciated.

Upvotes: 0

Views: 2760

Answers (2)

mck

Reputation: 42332

select takes a varargs parameter (Column*), so the `: _*` expansion must cover the entire argument list. Prepend the fixed columns to the mapped list and expand the single resulting List[Column], e.g.

df.select(col("a") :: col("b") :: col("c") :: cols_list.map(col): _*)
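The compiler error in the question comes from a Scala varargs rule: `: _*` may only annotate the whole argument list of a *-parameter, not one argument among several. A minimal plain-Scala sketch of the rule (the `select` below is a hypothetical stand-in, not Spark's):

```scala
// Hypothetical stand-in for a varargs method like Spark's select(cols: Column*)
def select(cols: String*): Seq[String] = cols.toSeq

val colsList = List("d", "e")

// select("a", "b", "c", colsList: _*)  // does not compile: : _* mixed with plain args
val all = select("a" :: "b" :: "c" :: colsList: _*)  // OK: one sequence, expanded whole
```

Here `all` is `Seq("a", "b", "c", "d", "e")`; the same prepend-then-expand shape works with `Column` values in Spark.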

Upvotes: 3

SCouto

Reputation: 7928

Your code works fine for me; you can also use the $ notation:

scala> df.select(cols_list.map(col):_*)
res8: org.apache.spark.sql.DataFrame = [d: int, e: int]

scala> df.select(cols_list.map(c => $"$c"):_*)
res9: org.apache.spark.sql.DataFrame = [d: int, e: int]

Maybe you just need to import spark.implicits._

EXTRA: Also check your variable names. The Scala naming convention is camelCase, and prefer val over var (this is just a good-practice note, not related to your error):

val colsList = List("d", "e")
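To combine fixed columns with the list, as in the question, one option is to build the full list of names first and map it to columns once. A sketch, assuming the question's `df` and `org.apache.spark.sql.functions.col` are in scope:

```scala
val colsList = List("d", "e")

// Prepend the fixed names, then map the whole list to Columns in one go:
val allNames = "a" :: "b" :: "c" :: colsList

// With a DataFrame `df` and col in scope, this would be:
// df.select(allNames.map(col): _*)
```

Keeping everything as one list of names sidesteps the varargs restriction entirely, since `: _*` then annotates a single argument.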

Upvotes: 0
