fred271828

Reputation: 968

How to query the column names of a Spark Dataset?

I have a val ds: Dataset[Double] (in Spark 2.0.0). What is the name of its double-valued column, which I can pass to apply or col to convert this single-column Dataset to a Column?

Upvotes: 9

Views: 36441

Answers (2)

Alberto Bonsanto

Reputation: 18022

You could also use the DataFrame method columns, which returns all column names as an Array of Strings.

// Outside spark-shell, toDF needs: import spark.implicits._ (where spark is your SparkSession)
case class Person(age: Int, height: Int, weight: Int) {
  def sum = age + height + weight
}

val df = sc.parallelize(List(Person(1, 2, 3), Person(4, 5, 6))).toDF("age", "height", "weight")

df.columns
// res0: Array[String] = Array(age, height, weight)

Upvotes: 10

fred271828

Reputation: 968

The column name is "value", as in ds.col("value"). Dataset.schema contains this information; to print it: ds.schema.fields.foreach(x => println(x))
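For completeness, here is a minimal sketch of the whole round trip, assuming a local SparkSession and some made-up sample values (the session settings, the data, and the names names and doubled are illustrative, not from the question):

```scala
import org.apache.spark.sql.{Column, Dataset, SparkSession}

// Assumption: a throwaway local session; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("dataset-column-names")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the asker's Dataset[Double].
val ds: Dataset[Double] = Seq(1.0, 2.5, 4.0).toDS()

// A Dataset of a primitive type has a single column named "value".
val names: Array[String] = ds.columns

// The schema carries the same name along with the column's type.
ds.schema.fields.foreach(println)

// Turn the single-column Dataset into a Column and use it in an expression.
val value: Column = ds.col("value")
val doubled = ds.select((value * 2).as("doubled"))
doubled.show()

spark.stop()
```

ds("value") is shorthand for the same lookup, since Dataset.apply takes a column name just as col does.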

Upvotes: 12
