Thomas Decaux
Reputation: 22661

Week number differs between Spark SQL date_format and weekofyear

A very simple query:

SELECT  date_format("2018-01-14", "w"), weekofyear("2018-01-14")

Gives:

3, 2

They should both return 2. How can I configure the locale correctly?

(Environment: user.country = fr, user.lang = FR)

I can see in the Spark source code for weekofyear that the week starts on Monday.

Upvotes: 1

Views: 423

Answers (1)

stefanobaghino
Reputation: 12794

As of version 2.2.1 (and also on the current master branch), date_format is defined in DateFormatClass, which in turn uses DateTimeUtils#newDateFormat. Unfortunately, that method uses a hard-coded Locale.US, leaving you no option to configure its behavior.

def newDateFormat(formatString: String, timeZone: TimeZone): DateFormat = {
  val sdf = new SimpleDateFormat(formatString, Locale.US)
  sdf.setTimeZone(timeZone)
  // Enable strict parsing, if the input date/format is invalid, it will throw an exception.
  // e.g. to parse invalid date '2016-13-12', or '2016-01-12' with  invalid format 'yyyy-aa-dd',
  // an exception will be throwed.
  sdf.setLenient(false)
  sdf
}

So it seems those two are bound to behave differently: date_format always applies the US week rules, while weekofyear starts the week on Monday. You may want to have a look at the Spark issue tracker and possibly file a ticket for this.
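The effect of the hard-coded Locale.US can be reproduced outside Spark with plain SimpleDateFormat. The following is a minimal sketch, not Spark's own code: the object WeekNumberDemo and the helper weekOfYear are hypothetical names introduced for illustration, and the time zone is pinned to UTC as an assumption so the result does not depend on the machine settings.

```scala
import java.text.SimpleDateFormat
import java.util.{Locale, TimeZone}

object WeekNumberDemo {
  // Hypothetical helper (not part of Spark): formats a yyyy-MM-dd string
  // with the "w" pattern, using the week rules of the given locale.
  def weekOfYear(isoDate: String, locale: Locale): Int = {
    val utc = TimeZone.getTimeZone("UTC") // assumption: pin the zone to UTC

    val parser = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT)
    parser.setTimeZone(utc)

    // The locale passed here decides the first day of the week and the
    // minimal number of days in the first week of the year.
    val weekFormat = new SimpleDateFormat("w", locale)
    weekFormat.setTimeZone(utc)

    weekFormat.format(parser.parse(isoDate)).toInt
  }

  def main(args: Array[String]): Unit = {
    // US rules: weeks start on Sunday, so Sunday 2018-01-14 opens week 3.
    println(weekOfYear("2018-01-14", Locale.US))
    // French rules: weeks start on Monday, so the same day closes week 2.
    println(weekOfYear("2018-01-14", Locale.FRANCE))
  }
}
```

With Locale.US the call returns 3, matching date_format in the question; with Locale.FRANCE it returns 2, matching weekofyear. Since date_format offers no way to pass a locale, using weekofyear directly is the simpler route when Monday-based week numbers are needed.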

Upvotes: 2
