HHH

Reputation: 6465

How to find the week difference between two dates

I have a dataframe with two date columns in Unix time, and I want to find the week difference between them. There is a weekofyear function in Spark SQL, but that is only useful when both dates fall in the same year. How can I find the week difference in general?

P.S. I'm using Spark with Scala.

Upvotes: 0

Views: 1174

Answers (2)

Mohammed Rafi

Reputation: 88

Since your dates are in Unix time, you can compute the difference with this expression:

((date1-date2)/(60*60*24*7)).toInt

Edit: updating this answer with an example:

   import org.apache.spark.sql.functions.udf

   // 60 * 60 * 24 * 7 = 604800 seconds per week
   val weekdiff = udf((from: Long, to: Long) => ((from - to) / 604800).toInt)
   df.withColumn("weekdiff", weekdiff(df("date1_col_name"), df("date2_col_name")))
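
Alternatively, if you register the function by name with `spark.udf.register`, it can be called from the DataFrame API via `callUDF`, or from SQL directly. A minimal sketch (the temp view name `dates` below is hypothetical):

   import org.apache.spark.sql.functions.callUDF

   // register the function by name in the session's UDF registry
   spark.udf.register("weekdiff", (from: Long, to: Long) => ((from - to) / 604800).toInt)

   // call the registered function by name from the DataFrame API
   df.withColumn("weekdiff", callUDF("weekdiff", df("date1_col_name"), df("date2_col_name")))

   // or from SQL, after exposing df as a temp view:
   // df.createOrReplaceTempView("dates")
   // spark.sql("SELECT *, weekdiff(date1_col_name, date2_col_name) AS weekdiff FROM dates")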

Upvotes: 1

Shivansh

Reputation: 3544

You can take the approach of creating a custom UDF for this:

scala> val df=sc.parallelize(Seq((1480401142453L,1480399932853L))).toDF("date1","date2")
df: org.apache.spark.sql.DataFrame = [date1: bigint, date2: bigint]

scala> df.show
+-------------+-------------+
|        date1|        date2|
+-------------+-------------+
|1480401142453|1480399932853|
+-------------+-------------+


scala> val udfDateDifference = udf((date1: Long, date2: Long) => ((date1 - date2) / (60 * 60 * 24 * 7)).toInt)
udfDateDifference: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function2>,IntegerType,Some(List(LongType, LongType)))

scala> val resultDF = df.withColumn("dateDifference", udfDateDifference(df("date1"), df("date2")))
resultDF: org.apache.spark.sql.DataFrame = [date1: bigint, date2: bigint ... 1 more field]

scala> resultDF.show
+-------------+-------------+--------------+
|        date1|        date2|dateDifference|
+-------------+-------------+--------------+
|1480401142453|1480399932853|             2|
+-------------+-------------+--------------+

And hence you can get the difference!
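
One caveat worth noting: 13-digit epoch values like the ones in this example are usually milliseconds rather than seconds, in which case the divisor should be the number of milliseconds in a week. A minimal sketch of the same UDF adjusted for millisecond timestamps (column names as in the example above):

   // 1000 * 60 * 60 * 24 * 7 = 604800000 milliseconds per week
   val udfWeekDiffMillis = udf((date1: Long, date2: Long) => ((date1 - date2) / 604800000L).toInt)

   val resultMillisDF = df.withColumn("weekDifference", udfWeekDiffMillis(df("date1"), df("date2")))

With the sample rows above this yields 0, since the two timestamps are only about 20 minutes apart when read as milliseconds.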

Upvotes: 1
