Reputation: 39
I am not able to convert timestamps in UTC to the AKST time zone in Spark 3.0. The same conversion works in Spark 2.4, and all other conversions work (to EST, PST, MST, etc.).
I'd appreciate any input on how to fix this error.
The command below:
spark.sql("select from_utc_timestamp('2020-10-01 11:12:30', 'AKST')").show
returns the error:
java.time.zone.ZoneRulesException: Unknown time-zone ID: AKST
Detailed log:
java.time.zone.ZoneRulesException: Unknown time-zone ID: AKST
at java.time.zone.ZoneRulesProvider.getProvider(ZoneRulesProvider.java:272)
at java.time.zone.ZoneRulesProvider.getRules(ZoneRulesProvider.java:227)
at java.time.ZoneRegion.ofId(ZoneRegion.java:120)
at java.time.ZoneId.of(ZoneId.java:411)
at java.time.ZoneId.of(ZoneId.java:359)
at java.time.ZoneId.of(ZoneId.java:315)
at org.apache.spark.sql.catalyst.util.DateTimeUtils$.getZoneId(DateTimeUtils.scala:62)
at org.apache.spark.sql.catalyst.util.DateTimeUtils$.fromUTCTime(DateTimeUtils.scala:833)
at org.apache.spark.sql.catalyst.expressions.FromUTCTimestamp.nullSafeEval(datetimeExpressions.scala:1299)
at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:552)
at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:457)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:52)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:45)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.trees.TreeNode.applyFunctionIfChanged$1(TreeNode.scala:380)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:416)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:248)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:414)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:362)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsDown$1(QueryPlan.scala:96)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:118)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:118)
at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:129)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:134)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike.map(TraversableLike.scala:238)
at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
at scala.collection.immutable.List.map(List.scala:298)
at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:134)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:139)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:248)
at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:139)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDown(QueryPlan.scala:96)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1.applyOrElse(expressions.scala:45)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1.applyOrElse(expressions.scala:44)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:149)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:147)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.trees.TreeNode.applyFunctionIfChanged$1(TreeNode.scala:380)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:416)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:248)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:414)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:362)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:149)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:147)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.trees.TreeNode.applyFunctionIfChanged$1(TreeNode.scala:380)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:416)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:248)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:414)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:362)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:326)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:149)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:147)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:310)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$.apply(expressions.scala:44)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$.apply(expressions.scala:43)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:149)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:89)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:146)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:138)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:138)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:116)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:98)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:116)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:82)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:121)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:153)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:153)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:82)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:79)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$writePlans$4(QueryExecution.scala:217)
at org.apache.spark.sql.catalyst.plans.QueryPlan$.append(QueryPlan.scala:381)
at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$writePlans(QueryExecution.scala:217)
at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:227)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:96)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:207)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:88)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3653)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2737)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2944)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:301)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:338)
at org.apache.spark.sql.Dataset.show(Dataset.scala:864)
at org.apache.spark.sql.Dataset.show(Dataset.scala:823)
at org.apache.spark.sql.Dataset.show(Dataset.scala:832)
... 47 elided
Upvotes: 2
Views: 1408
Reputation: 32650
Adding further to mck's answer: you are using the old Java date-time API's short zone IDs. According to the Databricks blog post A Comprehensive Look at Dates and Timestamps in Apache Spark™ 3.0, Spark migrated to the new API in version 3.0:
Since Java 8, the JDK has exposed a new API for date-time manipulation and time zone offset resolution, and Spark migrated to this new API in version 3.0. Although the mapping of time zone names to offsets has the same source, IANA TZDB, it is implemented differently in Java 8 and higher versus Java 7.
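You can see the behavioral difference directly in spark-shell (a minimal sketch, not from the original post): the legacy java.util.TimeZone silently falls back to GMT for an ID it cannot resolve, while java.time.ZoneId fails fast with the ZoneRulesException shown above. This may also be why Spark 2.4 appeared to work with AKST.
import java.util.TimeZone
import java.time.ZoneId
import scala.util.Try

// Legacy (Java 7) API: an unknown ID silently falls back to GMT instead of failing.
TimeZone.getTimeZone("AKST").getID   // likely "GMT" rather than an error

// Java 8+ API, used by Spark 3.0: the same ID throws.
Try(ZoneId.of("AKST"))   // Failure(java.time.zone.ZoneRulesException: Unknown time-zone ID: AKST)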
You can verify this by opening spark-shell and listing the available short zone IDs like this:
import java.time.ZoneId
import scala.collection.JavaConverters._

// Short zone IDs recognized by the java.time API; note AKST is not among them
ZoneId.SHORT_IDS.asScala.keys
//res0: Iterable[String] = Set(CTT, ART, CNT, PRT, PNT, PLT, AST, BST, CST, EST, HST, JST, IST, AGT, NST, MST, AET, BET, PST, ACT, SST, VST, CAT, ECT, EAT, IET, MIT, NET)
Note that AKST is absent from this set, while EST, PST, and MST are present, which is why those conversions still work. That said, you should not use abbreviations when specifying time zones; use the area/city format instead. See Which three-letter time zone IDs are not deprecated?
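One practical workaround is to translate the abbreviation to its IANA area/city ID before passing it to from_utc_timestamp. A sketch, using a hypothetical alias table of my own (only the abbreviations mentioned in the question):
// Hypothetical alias table: abbreviations from the question mapped to IANA area/city IDs
val zoneAliases = Map(
  "AKST" -> "America/Anchorage",
  "EST"  -> "America/New_York",
  "MST"  -> "America/Denver",
  "PST"  -> "America/Los_Angeles"
)

// Resolve the abbreviation before handing it to from_utc_timestamp
val tz = zoneAliases.getOrElse("AKST", "UTC")
spark.sql(s"select from_utc_timestamp('2020-10-01 11:12:30', '$tz')").show
Keep in mind that area/city zones observe daylight saving time, whereas fixed-offset abbreviations like EST do not, so results can differ for timestamps that fall in the DST window.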
Upvotes: 2
Reputation: 42342
It seems Spark 3 can't understand AKST, but it does understand America/Anchorage, which I suppose uses the AKST time zone:
spark.sql("select from_utc_timestamp('2020-10-01 11:12:30', 'America/Anchorage')").show
Upvotes: 1