Explorer

Reputation: 1647

Generate schemaless Avro using Spark

Is there a way to generate schemaless Avro from Apache Spark? I can see how to do it from Java/Scala using the Apache Avro library, and through Confluent Avro. When I write Avro from Spark as shown below, it creates Avro files that include the schema. I want to write them without the schema to reduce the size of the final dataset.

df.write.format("avro").save("person.avro")

Upvotes: 0

Views: 658

Answers (1)

Ged

Reputation: 18003

You need not worry, and in any case you cannot avoid it.

An Avro file always contains both the data and the schema.

Avro differs from JSON, where the field names (in effect, the schema) are repeated in every record, inside the data itself.

With Avro, the schema is stored once per file, so there is little overhead to consider.
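
To get a rough feel for why a once-per-file schema is cheap, here is a small sketch in plain Python. It does not use Avro or Spark and does not reproduce Avro's binary encoding; it only compares the two layouts described above, using JSON text as a stand-in. The sample records, field names, and function names are all made up for illustration.

```python
import json

# Hypothetical sample data, invented for this illustration.
FIELDS = ["name", "age", "city"]
RECORDS = [
    {"name": f"user{i}", "age": 20 + i % 50, "city": "Oslo"}
    for i in range(10_000)
]


def encoded_len(obj):
    """Size in bytes of obj written as one JSON line (including the newline)."""
    return len(json.dumps(obj).encode("utf-8")) + 1


def per_record_names_size(rows):
    """Total bytes when every record carries its own field names."""
    return sum(encoded_len(row) for row in rows)


def names_once_sizes(rows, fields):
    """Return (header_bytes, body_bytes) when field names are written once.

    The header stands in for the schema; the body holds bare values only.
    """
    header = encoded_len({"fields": fields})
    body = sum(encoded_len([row[f] for f in fields]) for row in rows)
    return header, body


if __name__ == "__main__":
    repeated = per_record_names_size(RECORDS)
    header, body = names_once_sizes(RECORDS, FIELDS)
    print(f"names in every record: {repeated} bytes")
    print(f"names once per file:   {header + body} bytes")
    print(f"header share of file:  {header / (header + body):.4%}")
```

The header is a fixed cost, so its share of the file shrinks as the number of records grows. A real Avro schema is larger than this toy header, but the same reasoning should apply.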

Upvotes: 2
