naimdjon

Reputation: 3602

Expressing Spark `StructType` in an Avro schema

How would you describe the Spark `StructType` data type in an Avro schema? I am generating a Parquet file whose format is described by an Avro schema. The file is then loaded from S3 into Spark. There are array and map data types, but these do not correspond to `StructType`.

Upvotes: 0

Views: 1128

Answers (1)

mtapia

Reputation: 11

Using the package org.apache.spark.sql.avro (Spark 2.4), you can convert Spark SQL schemas to Avro schemas and vice versa.

You can try this way:

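import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types.StructType
// avroSchema is an org.apache.avro.Schema; genericRecordToRow is a helper you write yourself
val sqlType = SchemaConverters.toSqlType(avroSchema)
val rowRDD = yourGenericRecordRDD.map(record => genericRecordToRow(record, sqlType))
// A top-level Avro record converts to a StructType, so the cast is safe for record schemas
val df = sqlContext.createDataFrame(rowRDD, sqlType.dataType.asInstanceOf[StructType])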
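
On the `StructType` part of the question: as far as I can tell, the Avro type that lines up with a Spark `StructType` is a `record`, and a nested struct is just a record used as a field type. Here is a minimal sketch of that mapping, assuming the spark-avro module is on the classpath. The schema itself (`Person`, `Address` and their fields) is made up for illustration and is not taken from the question.

```scala
import org.apache.avro.Schema
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types.StructType

// Hypothetical Avro schema: record and field names are illustrative only.
val avroJson =
  """{
    |  "type": "record",
    |  "name": "Person",
    |  "fields": [
    |    {"name": "name", "type": "string"},
    |    {"name": "address", "type": {
    |      "type": "record",
    |      "name": "Address",
    |      "fields": [
    |        {"name": "city", "type": "string"},
    |        {"name": "zip", "type": ["null", "string"], "default": null}
    |      ]
    |    }}
    |  ]
    |}""".stripMargin

val avroSchema = new Schema.Parser().parse(avroJson)

// The top-level record becomes a StructType; the nested "address" record
// becomes a StructType field inside it.
val sparkSchema =
  SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]
sparkSchema.printTreeString()

// Going the other way: a StructType is converted back to an Avro record.
val backToAvro = SchemaConverters.toAvroType(sparkSchema)
```

The `["null", "string"]` union is how an optional field is usually written in Avro; it should come out as a nullable string column on the Spark side.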

Here you can find more answers too: Code

Upvotes: 1
