Reputation: 93
I am trying to include the spark-avro package while starting spark-shell, as per the instructions mentioned here: https://github.com/databricks/spark-avro#with-spark-shell-or-spark-submit.
spark-shell --packages com.databricks:spark-avro_2.10:2.0.1
My intent is to convert the Avro schema to a Spark SQL schema type, using the SchemaConverters class present in the package.
import com.databricks.spark.avro._
...
// colListDel is the list of fields from the .avsc file that are to be deleted for some functional reason
for (field <- colListDel) {
  println(SchemaConverters.toSqlType(field.schema()).dataType)
}
...
On executing the above for loop, I get the following error:
<console>:47: error: object SchemaConverters in package avro cannot be accessed in package com.databricks.spark.avro
println(SchemaConverters.toSqlType(field.schema()).dataType);
Please suggest if there is anything I am missing, or let me know how to use SchemaConverters in my Scala code.
Below are my environment details: Spark version 1.6.0, Cloudera VM 5.7.
Thanks!
Upvotes: 4
Views: 2634
Reputation: 1596
This object and the mentioned method used to be private. Please check the source code from version 1.0:
private object SchemaConverters {
  case class SchemaType(dataType: DataType, nullable: Boolean)

  /**
   * This function takes an avro schema and returns a sql schema.
   */
  private[avro] def toSqlType(avroSchema: Schema): SchemaType = {
    avroSchema.getType match {
      ...
You were downloading the 2.0.1 version, which was probably not built from the latest 2.0 branch. I checked the 3.0 version, and this class and method are public now.
This should solve your problem:
spark-shell --packages com.databricks:spark-avro_2.10:3.0.0
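With 3.0.0 on the classpath, calling the converter directly should work. Here is a minimal sketch, assuming an .avsc file parsed with Avro's Schema.Parser; the file path and the loop over record fields are placeholders mirroring your code, not something prescribed by the library:

import java.io.File
import org.apache.avro.Schema
import com.databricks.spark.avro.SchemaConverters
import scala.collection.JavaConverters._

// Parse the Avro schema from an .avsc file (hypothetical path)
val avroSchema = new Schema.Parser().parse(new File("/path/to/record.avsc"))

// toSqlType is public in 3.0, so each Avro field type
// can be mapped to its Spark SQL DataType
for (field <- avroSchema.getFields.asScala) {
  println(s"${field.name}: ${SchemaConverters.toSqlType(field.schema).dataType}")
}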
EDIT: added after comment
The spark-avro 3.0.0 library requires Spark 2.0, so one option is to replace your current Spark with a 2.0 version. The other option would be to contact Databricks and ask them to build a 2.0.2 version from the latest 2.0 branch.
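Note that Spark 2.0 builds default to Scala 2.11, so if you switch Spark versions you will probably want the matching artifact suffix (assuming the _2.11 build of 3.0.0 is published, which it appears to be):

spark-shell --packages com.databricks:spark-avro_2.11:3.0.0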
Upvotes: 1