coder AJ

Reputation: 1

Apache Spark 2.0 - date_add function

I have a simple schema with a date and an int. I want to use date_add to add the int to the date.

scala> val ds1 = spark.read.option("inferSchema",true).csv("samp.csv")

ds1.printSchema();

root
 |-- _c0: timestamp (nullable = true)
 |-- _c1: integer (nullable = true)

I cannot get the first parameter to date_add to work...please help!

scala> val ds2 = ds1.map ( x => date_add(x.getAs[timestamp]("_c0"),  x.getAs[Int]("_c1")))
<console>:28: error: not found: type timestamp

scala> val ds2 = ds1.map ( x => date_add(x.getAs[Column]("_c0"), x.getAs[Int] ("_c1")))
<console>:28: error: not found: type Column

Upvotes: 0

Views: 2134

Answers (1)

OneCricketeer

Reputation: 191728

date_add is not your immediate problem... not found: type {timestamp, Column}

I'm not sure how you expect x.getAs[timestamp] to work, honestly, but for the other, you need an import.

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.date_add

Now try

val ds2 = ds1.map { x =>
  date_add(ds1("_c0"), x.getAs[Int]("_c1"))
}

(Though, you should ideally not be using Dataset.map)
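For what it's worth, here is a minimal map-free sketch. It assumes the same samp.csv layout as above (a timestamp in _c0, an int in _c1) and leans on the SQL form of date_add, which, unlike the Scala functions.date_add(Column, Int) in 2.x, accepts a column for the day count:

import org.apache.spark.sql.SparkSession

// Sketch only: assumes samp.csv has a timestamp in _c0 and an integer in _c1, as inferred above.
// (In spark-shell the `spark` session already exists; the builder is only needed in a standalone app.)
val spark = SparkSession.builder().getOrCreate()

val ds1 = spark.read.option("inferSchema", true).csv("samp.csv")

// The SQL date_add accepts any integer expression for the day count,
// so the per-row value in _c1 can be used directly; no Dataset.map needed.
val ds2 = ds1.selectExpr("date_add(_c0, _c1) AS date_plus_days")
ds2.show()

The same expression can also be written as ds1.select(expr("date_add(_c0, _c1)")) after import org.apache.spark.sql.functions.expr.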

Upvotes: 1
