data_person

Reputation: 4480

Ambiguous schema in Spark Scala

Schema:

|-- c0: string (nullable = true)
|-- c1: struct (nullable = true)
|    |-- c2: array (nullable = true)
|    |    |-- element: struct (containsNull = true)
|    |    |    |-- orangeID: string (nullable = true)
|    |    |    |-- orangeId: string (nullable = true)

I am trying to flatten the schema above in Spark.

Code:

val df = data
  .select($"c0", $"c1.*")
  .select($"c0", explode($"c2"))
  .select($"c0", $"col.orangeID", $"col.orangeId")

The flattening itself works fine. The problem is in the last select, where the two field names differ only in the case of one letter (orangeID and orangeId). As a result I get this error:

Error:

org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(orangeID,StringType,true), StructField(orangeId,StringType,true);

Any suggestions for avoiding this ambiguity would be great.

Upvotes: 9

Views: 3920

Answers (1)

Chandan Ray

Reputation: 2091

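Spark SQL resolves column and field names case-insensitively by default, so orangeID and orangeId are treated as the same name. Turn on the Spark SQL case sensitivity configuration and run the query again: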

spark.sql("set spark.sql.caseSensitive=true")
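For reference, here is a minimal end-to-end sketch of how the setting might fit with the flattening code from the question. The object name, the sample rows, and the output aliases (orangeID_upper, orangeId_lower) are made up for illustration, and it assumes a local SparkSession. It uses spark.conf.set, which should have the same effect as the SQL statement above. The aliases are optional; they only make the two output columns easy to tell apart downstream.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.explode
import org.apache.spark.sql.types._

// Hypothetical object name, for illustration only.
object CaseSensitiveFlatten {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("case-sensitive-flatten")
      .master("local[*]") // assumption: running locally
      .getOrCreate()
    import spark.implicits._

    // Make name resolution case sensitive for this session.
    spark.conf.set("spark.sql.caseSensitive", "true")

    // Schema mirroring the one in the question.
    val element = StructType(Seq(
      StructField("orangeID", StringType, nullable = true),
      StructField("orangeId", StringType, nullable = true)))
    val schema = StructType(Seq(
      StructField("c0", StringType, nullable = true),
      StructField("c1", StructType(Seq(
        StructField("c2", ArrayType(element, containsNull = true), nullable = true)
      )), nullable = true)))

    // Made-up sample data: one row whose array holds two elements.
    val rows = Seq(Row("a", Row(Seq(Row("x1", "y1"), Row("x2", "y2")))))
    val data = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    // Same flattening as in the question; explode names its output column "col".
    val df = data
      .select($"c0", $"c1.*")
      .select($"c0", explode($"c2"))
      .select(
        $"c0",
        $"col.orangeID".as("orangeID_upper"), // hypothetical alias
        $"col.orangeId".as("orangeId_lower")) // hypothetical alias

    df.show()
    spark.stop()
  }
}
```

Note that the setting applies to the whole session, so any other queries in the same session that rely on case-insensitive column names would need their casing fixed as well.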

Upvotes: 16
