StephanieCGraduate

Reputation: 53

pyspark timestamp with timezone

I am trying to extract a value from a table using PySpark, and I need it in this format: 2020-06-17T15:08:24z

existingMaxModifiedDate = spark.sql('select max(lastModDt) as lastModDate from db.tbl')

jobMetadata = existingMaxModifiedDate.withColumn("maxDate", date_format(to_timestamp(existingMaxModifiedDate.lastModDate, "yyyy-mm-dd HH:MM:SS.SSS"), "yyyy-mm-dd HH:MM:SS.SSS"))

However, the new column "maxDate" is always null. Thank you.

Upvotes: 1

Views: 1093

Answers (1)

Som

Reputation: 6338

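Perhaps this is useful. The time3 column below produces the format you asked for. The snippet is Scala, but the same SQL can be passed to spark.sql from PySpark: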

    val timeDF = spark.sql(
      """
        |select current_timestamp() as time1,
        | translate(date_format(current_timestamp(), 'yyyy-MM-dd HH:mm:ssZ') ,' ', 'T') as time2,
        | translate(date_format(current_timestamp(), 'yyyy-MM-dd#HH:mm:ss$') ,'#$', 'Tz') as time3
      """.stripMargin)
    timeDF.show(false)
    timeDF.printSchema()

    /**
      * +-----------------------+------------------------+--------------------+
      * |time1                  |time2                   |time3               |
      * +-----------------------+------------------------+--------------------+
      * |2020-06-30 21:22:04.541|2020-06-30T21:22:04+0530|2020-06-30T21:22:04z|
      * +-----------------------+------------------------+--------------------+
      *
      * root
      * |-- time1: timestamp (nullable = false)
      * |-- time2: string (nullable = false)
      * |-- time3: string (nullable = false) 
      */
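
For the PySpark code in the question, the same idea can be sketched with the DataFrame API. This is only a sketch: it assumes lastModDate holds values like 2020-06-17 15:08:24.000, and with_max_date, INPUT_PATTERN and OUTPUT_PATTERN are names made up here for illustration. In Spark datetime patterns, MM is the month, mm the minute, ss the second and SSS the fraction of a second, so the pattern in the question mixes these up, which is a likely cause of the null.

```python
# Sketch only: the column names follow the question; the input pattern is an
# assumption about how lastModDate is stored (e.g. "2020-06-17 15:08:24.000").
INPUT_PATTERN = "yyyy-MM-dd HH:mm:ss.SSS"    # MM = month, mm = minute, ss = second, SSS = fraction
OUTPUT_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'z'"  # text in single quotes is emitted literally


def with_max_date(df, source_col="lastModDate", target_col="maxDate"):
    """Return df with target_col formatted like 2020-06-17T15:08:24z (hypothetical helper)."""
    # Imported inside the function so this file can be loaded without Spark installed.
    from pyspark.sql import functions as F

    # Parse the source column, then render it with the literal 'T' and 'z'.
    parsed = F.to_timestamp(F.col(source_col), INPUT_PATTERN)
    return df.withColumn(target_col, F.date_format(parsed, OUTPUT_PATTERN))
```

With the DataFrame from the question this would be called as jobMetadata = with_max_date(existingMaxModifiedDate). Note that the quoted 'z' is just a literal character, exactly as in time3 above; it does not convert the value to UTC.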

Upvotes: 1
