Reputation: 1174
I need to parse really long json file (more than million items). I don't want to load it to the memory and read it chunk by chunk. There's a good example with the array of items here. The problem is that I deal with the map. And when I call Decode
I get not at beginning of value
.
I can't get what should be changed.
const data = `{
"object1": {"name": "cattle","location": "kitchen"},
"object2": {"name": "table","location": "office"}
}`
type ReadObject struct {
Name string `json:"name"`
Location string `json:"location"`
}
func ParseJSON() {
dec := json.NewDecoder(strings.NewReader(data))
tkn, err := dec.Token()
if err != nil {
log.Fatalf("failed to read opening token: %v", err)
}
fmt.Printf("opening token: %v\n", tkn)
objects := make(map[string]*ReadObject)
for dec.More() {
var nextSymbol string
if err := dec.Decode(&nextSymbol); err != nil {
log.Fatalf("failed to parse next symbol: %v", err)
}
nextObject := &ReadObject{}
if err := dec.Decode(&nextObject); err != nil {
log.Fatalf("failed to parse next object")
}
objects[nextSymbol] = nextObject
}
tkn, err = dec.Token()
if err != nil {
log.Fatalf("failed to read closing token: %v", err)
}
fmt.Printf("closing token: %v\n", tkn)
fmt.Printf("OBJECTS: \n%v\n", objects)
}
Upvotes: 1
Views: 414
Reputation: 51790
After consuming the initial {
with your first call to dec.Token()
, you must :
dec.Token()
to extract the next keydec.Decode(&nextObject)
to decode an entryexample code :
for dec.More() {
key, err := dec.Token()
if err != nil {
// handle error
}
var val interface{}
err = dec.Decode(&val)
if err != nil {
// handle error
}
fmt.Printf(" %s : %v\n", key, val)
}
https://play.golang.org/p/5r1d8MsNlKb
Upvotes: 1
Reputation: 126
TL,DR: when you are calling Token()
method for a first time, you move offset from the beginning (of a JSON value) and therefore you get the error.
You are working with this struct (link):
type Decoder struct {
// others fields omits for simplicity
tokenState int
}
Pay attention for a tokenState field. This value could be one of (link):
const (
tokenTopValue = iota
tokenArrayStart
tokenArrayValue
tokenArrayComma
tokenObjectStart
tokenObjectKey
tokenObjectColon
tokenObjectValue
tokenObjectComma
)
Let's back to your code. You are calling Token()
method. This method obtains first JSON-valid token {
and changes tokenState from tokenObjectValue to the tokenObjectStart (link). Now you are "in-an-object" state.
If you try to call Decode()
at this point you will get an error (not at beginning of value). This is because allowed states of tokenState for calling Decode()
are tokenTopValue, tokenArrayStart, tokenArrayValue, tokenObjectValue, i.e. "full" value, not part of it (link).
To avoid this you can just don't call Token()
at all and do something like this:
dec := json.NewDecoder(strings.NewReader(dataMapFromJson))
objects := make(map[string]*ReadObject)
if err := dec.Decode(&objects); err != nil {
log.Fatalf("failed to parse next symbol: %v", err)
}
fmt.Printf("OBJECTS: \n%v\n", objects)
Or, if you want to read chunk-by-chunk, you could keep calling Token()
until you reach "full" value. And then call Decode()
on this value (I guess this should work).
Upvotes: 2