Sakshi Chourasia

Reputation: 31

Create table in Spark SQL doesn't support NOT NULL

I ran the query

CREATE TABLE IF NOT EXISTS OMIDimensionshql.DimPrimaryProduct (PrimaryProductKey int, abc STRING, SCDStartDate timestamp NOT NULL, SCDEndDate timestamp, OMIDQFailedFlag boolean, OMIComputeDeletedFlag boolean NOT NULL, OMIComputeCreatedDate timestamp NOT NULL, OMIComputeModifiedDate timestamp NOT NULL ) Using delta LOCATION 'adl://psinsightsadlsdev01.azuredatalakestore.net//PPE/Compute/OMIDimensions/DimPrimaryProductGrouping/Full/'

using spark.sql(), but it gives the error below:

Exception in thread "main" org.apache.spark.sql.catalyst.parser.ParseException: 
no viable alternative at input 'CREATE TABLE IF NOT EXISTS OMIDimensionshql.DimPrimaryProduct (PrimaryProductKey int, abc STRING, SCDStartDate timestamp NOT'(line 1, pos 121)

== SQL ==
CREATE TABLE IF NOT EXISTS OMIDimensionshql.DimPrimaryProduct (PrimaryProductKey int, abc STRING, SCDStartDate timestamp NOT NULL, SCDEndDate timestamp, OMIDQFailedFlag boolean, OMIComputeDeletedFlag boolean NOT NULL, OMIComputeCreatedDate timestamp NOT NULL, OMIComputeModifiedDate timestamp NOT NULL ) Using delta LOCATION 'adl://psinsightsadlsdev01.azuredatalakestore.net//PPE/Compute/OMIDimensions/DimPrimaryProductGrouping/Full/'
-------------------------------------------------------------------------------------------------------------------------^^^

at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:239)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:115)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
at com.ms.omi.meta.execute.Execute$$anonfun$createSubjectAreaTables$1.apply(Execute.scala:55)
at com.ms.omi.meta.execute.Execute$$anonfun$createSubjectAreaTables$1.apply(Execute.scala:46)
at scala.collection.immutable.List.foreach(List.scala:381)
at com.ms.omi.meta.execute.Execute$.createSubjectAreaTables(Execute.scala:46)
at com.ms.omi.meta.entry.EntOmiMetaStore$.main(EntOmiMetaStore.scala:21)
at com.ms.omi.meta.entry.EntOmiMetaStore.main(EntOmiMetaStore.scala)

Process finished with exit code 1

When I execute the same query in a Spark SQL notebook on a Databricks cluster it works; it just doesn't work when I execute it locally in Scala using spark.sql().
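For reference, the local call looks roughly like this (a simplified sketch; the SparkSession builder options are not the exact ones from Execute.scala):

import org.apache.spark.sql.SparkSession

// Minimal local reproduction; the builder settings are assumptions,
// not copied from the original code.
val spark = SparkSession.builder()
  .appName("OmiMetaStore")
  .master("local[*]")
  .getOrCreate()

val ddl =
  """CREATE TABLE IF NOT EXISTS OMIDimensionshql.DimPrimaryProduct (
    |  PrimaryProductKey int,
    |  abc STRING,
    |  SCDStartDate timestamp NOT NULL,
    |  SCDEndDate timestamp,
    |  OMIDQFailedFlag boolean,
    |  OMIComputeDeletedFlag boolean NOT NULL,
    |  OMIComputeCreatedDate timestamp NOT NULL,
    |  OMIComputeModifiedDate timestamp NOT NULL
    |) USING delta
    |LOCATION 'adl://psinsightsadlsdev01.azuredatalakestore.net//PPE/Compute/OMIDimensions/DimPrimaryProductGrouping/Full/'""".stripMargin

// Throws the ParseException shown above when run against the local Spark parser.
spark.sql(ddl)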

Upvotes: 3

Views: 3202

Answers (2)

Harsha TJ

Reputation: 272

There are two points here:

  1. Delta had a flaw regarding exactly this. If you are on an existing dedicated cluster running Databricks Runtime 4.0, or anything below 5.0 Beta, you can't use NOT NULL on columns in your DDLs. If you have access to 5.0 Beta or the official 5.0 release, this is now supported. Databricks fixed it in 5.0 Beta and above, along with the 10K limit on MERGE INTO.

  2. You might also want to set the following configuration options:

sql("SET spark.databricks.delta.preview.enabled=true")

sql("SET spark.databricks.delta.merge.joinBasedMerge.enabled = true")

Upvotes: 1

user10543282

Reputation:

NOT NULL constraints are not supported in the standard Spark runtime.

Databricks uses its own runtime with a number of proprietary extensions, so features that are present there are not necessarily available in the open-source Spark distribution.

In fact, another feature you are trying to use, Databricks Delta, is a proprietary extension as well.
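If you need the statement to at least parse on the open-source runtime, a rough workaround sketch is to drop the NOT NULL clauses and enforce non-null values in the job that writes the data; note that USING delta still requires the Delta data source on the classpath, which, as noted above, is a Databricks extension:

// Sketch only: the same DDL with the NOT NULL clauses removed so the
// open-source parser accepts it; nullability is then the writer's responsibility.
spark.sql(
  """CREATE TABLE IF NOT EXISTS OMIDimensionshql.DimPrimaryProduct (
    |  PrimaryProductKey int,
    |  abc STRING,
    |  SCDStartDate timestamp,
    |  SCDEndDate timestamp,
    |  OMIDQFailedFlag boolean,
    |  OMIComputeDeletedFlag boolean,
    |  OMIComputeCreatedDate timestamp,
    |  OMIComputeModifiedDate timestamp
    |) USING delta
    |LOCATION 'adl://psinsightsadlsdev01.azuredatalakestore.net//PPE/Compute/OMIDimensions/DimPrimaryProductGrouping/Full/'""".stripMargin)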

Upvotes: 2
