abc_spark

Reputation: 383

How to create a Spark DF as below

I need to create a Spark DataFrame in Scala like the one below. This may be a basic question, but I'd like to know the best approach for creating small structures for testing purposes:

  1. For creating a minimal DF.
  2. For creating a minimal RDD.

I've tried the following code so far, without success:

val rdd2 = sc.parallelize(Seq("7","8","9"))

and then converting it to a DataFrame with

val dfSchema = Seq("col1", "col2", "col3") 

and

 rdd2.toDF(dfSchema: _*)

Here's a sample DataFrame I'd like to obtain:

c1  c2  c3
1   2   3
4   5   6

Upvotes: 0

Views: 163

Answers (2)

Nikhil Suthar

Reputation: 2431

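You are missing one pair of parentheses inside Seq. Seq("7","8","9") is three single-column rows, so toDF cannot apply three column names to it. Wrap the values in a tuple so that they form one three-column row: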

scala> val df = sc.parallelize(Seq(("7","8","9"))).toDF("col1", "col2", "col3")

scala> df.show
+----+----+----+
|col1|col2|col3|
+----+----+----+
|   7|   8|   9|
+----+----+----+
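
Note that the columns above are strings. If integer columns are wanted, one option is to pass an explicit schema to createDataFrame. This is only a minimal sketch: the local SparkSession setup and the app name are assumptions for a standalone test (in spark-shell, `spark` already exists), and the column names c1, c2, c3 are taken from the question.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Assumption: a local single-threaded session for testing.
// In spark-shell, skip this line and use the provided `spark`.
val spark = SparkSession.builder().master("local[1]").appName("minimal-df").getOrCreate()

// Explicit schema: three non-nullable integer columns.
val schema = StructType(Seq(
  StructField("c1", IntegerType, nullable = false),
  StructField("c2", IntegerType, nullable = false),
  StructField("c3", IntegerType, nullable = false)
))

// Each Row is one record; its values must match the schema order and types.
val rows = Seq(Row(1, 2, 3), Row(4, 5, 6))
val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

df.printSchema()
```

For quick tests the tuple-based toDF is usually shorter; the explicit schema mainly helps when types or nullability need to be controlled.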

Upvotes: 1

baitmbarek

Reputation: 2518

abc_spark, here's a sample you can use to easily create DataFrames and RDDs for testing:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row
import spark.implicits._

val df = Seq(
      (1, 2, 3),
      (4, 5, 6)
    ).toDF("c1", "c2", "c3")

df.show(false)

+---+---+---+
|c1 |c2 |c3 |
+---+---+---+
|1  |2  |3  |
|4  |5  |6  |
+---+---+---+

val rdd: RDD[Row] = df.rdd

rdd.map{_.getAs[Int]("c2")}.foreach{println}

This prints the c2 values. The order can vary, because foreach runs on each partition independently:

5
2
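
For the RDD half of the question, a minimal RDD can also be built directly with parallelize, without going through a DataFrame first. This is a sketch under assumptions: the local SparkSession and the app name are placeholders for a standalone test (spark-shell already provides `spark` and `sc`), and the value names are illustrative.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Assumption: a local single-threaded session for testing.
val spark = SparkSession.builder().master("local[1]").appName("minimal-rdd").getOrCreate()
import spark.implicits._

// Minimal RDD: each tuple is one three-field record.
val tupleRdd: RDD[(Int, Int, Int)] =
  spark.sparkContext.parallelize(Seq((1, 2, 3), (4, 5, 6)))

// collect() returns the data to the driver, unlike foreach,
// which runs on the executors.
val c2Values: Array[Int] = tupleRdd.map(_._2).collect()

// The same RDD converts to a DataFrame when one is needed.
val fromRdd = tupleRdd.toDF("c1", "c2", "c3")
fromRdd.show(false)
```

Collecting to the driver is fine for small test data like this; on large data it would pull everything into driver memory.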

Upvotes: 1
