a.moussa

Reputation: 3277

zipWithIndex rdd with initial value

I have an RDD like this:

+----------+--------+
|firstName |lastName|
+----------+--------+
|      john|   smith|
|      anna|  tourde|
+----------+--------+

I would like to add a new index column, as zipWithIndex does, but starting from an initial value of 8:

+----------+--------+-----+
|firstName |lastName|index|
+----------+--------+-----+
|      john|   smith|    8|
|      anna|  tourde|    9|
+----------+--------+-----+

Do you have any idea? Thanks

Upvotes: 1

Views: 5157

Answers (2)

koiralo

Reputation: 23099

Use zipWithIndex and convert the result back to a DataFrame, as below:

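import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}

val df1 = spark.sqlContext.createDataFrame(
  // Append (index + 8) to each row so that numbering starts at 8
  df.rdd.zipWithIndex.map {
    case (row, index) => Row.fromSeq(row.toSeq :+ (index + 8))
  },
  // Extend the original schema with a non-nullable index column
  StructType(df.schema.fields :+ StructField("index", LongType, nullable = false))
)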

Upvotes: 3

Vitalii Kotliarenko

Reputation: 2967

rdd.zipWithIndex().map { case (v, ind) =>
  (v, ind + 8)
}
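
For anyone who wants to try the snippet above end to end, here is a minimal sketch under a few assumptions: a local SparkSession, sample data copied from the question, and the names ZipWithOffset and withOffsetIndex, which are made up for illustration. It shifts the Long index produced by zipWithIndex by a start value and then names the columns with toDF.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ZipWithOffset {
  // Hypothetical helper: builds the question's two sample rows and
  // attaches an index column that starts at `start` instead of 0.
  def withOffsetIndex(spark: SparkSession, start: Long): DataFrame = {
    import spark.implicits._
    val rdd = spark.sparkContext.parallelize(Seq(("john", "smith"), ("anna", "tourde")))
    rdd.zipWithIndex()
      // zipWithIndex yields (element, Long index starting at 0); shift it by `start`
      .map { case ((first, last), ind) => (first, last, ind + start) }
      .toDF("firstName", "lastName", "index")
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch
    val spark = SparkSession.builder().master("local[1]").appName("zip-with-offset").getOrCreate()
    withOffsetIndex(spark, 8L).show()
    spark.stop()
  }
}
```

With a start value of 8, john gets index 8 and anna gets index 9, which matches the table in the question. Note that zipWithIndex has to run a Spark job when the RDD has more than one partition, so it is not free on large inputs.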

Upvotes: 6
