Reputation: 807
Similar to this question How to extract schema for avro file in python Is there a way to read in an avro file in golang without knowing the schema beforehand and extract a schema?
Upvotes: 2
Views: 3041
Reputation: 4161
Both https://github.com/hamba/avro and https://github.com/linkedin/goavro can decode Avro OCF files (which it sounds like is what you have) without an explicit schema file.
Once you've created a new reader/decoder, you can retrieve the metadata, which includes the schema at key avro.schema
: https://pkg.go.dev/github.com/hamba/avro/ocf#Decoder.Metadata and https://pkg.go.dev/github.com/linkedin/goavro#OCFReader.MetaData
Upvotes: 0
Reputation: 5426
How about something like this (adapted code from https://github.com/hamba/avro/blob/master/ocf/ocf.go):
package main
import (
"github.com/hamba/avro"
"log"
"os"
)
// HeaderSchema is the Avro schema of a container file header.
var HeaderSchema = avro.MustParse(`{
"type": "record",
"name": "org.apache.avro.file.Header",
"fields": [
{"name": "magic", "type": {"type": "fixed", "name": "Magic", "size": 4}},
{"name": "meta", "type": {"type": "map", "values": "bytes"}},
{"name": "sync", "type": {"type": "fixed", "name": "Sync", "size": 16}}
]
}`)
var magicBytes = [4]byte{'O', 'b', 'j', 1}
const (
schemaKey = "avro.schema"
)
// Header represents an Avro container file header.
type Header struct {
Magic [4]byte `avro:"magic"`
Meta map[string][]byte `avro:"meta"`
Sync [16]byte `avro:"sync"`
}
func main() {
r, err := os.Open("path/my.avro")
if err != nil {
log.Fatal(err)
}
defer r.Close()
reader := avro.NewReader(r, 1024)
var h Header
reader.ReadVal(HeaderSchema, &h)
if reader.Error != nil {
log.Println("decoder: unexpected error: %v", reader.Error)
}
if h.Magic != magicBytes {
log.Println("decoder: invalid avro file")
}
schema, err := avro.Parse(string(h.Meta[schemaKey]))
if err != nil {
log.Println(err)
}
log.Println(schema)
}
Upvotes: 4