hendry

Reputation: 10843

Debugging a JSON error from Golang

I'm fetching and decoding a large JSON response that has an error in it. Now I need to find where the error is! I read about json.SyntaxError but I am struggling to find out how to use it.

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "text/template"
    "time"
)

type Movie struct {
    Title       string    `json:"title"`
    PublishedAt time.Time `json:"published_at"`
}

func main() {
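    req, _ := http.NewRequest("GET", "https://s.natalian.org/2016-12-07/debugme2.json", nil)
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        // resp is nil when the request fails, so check before touching resp.Body
        fmt.Fprintln(os.Stderr, err)
        return
    }
    defer resp.Body.Close()
    dec := json.NewDecoder(resp.Body)

    // consume the opening '[' of the top-level JSON array
    if _, err = dec.Token(); err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }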
    for dec.More() {
        var m Movie
        if err = dec.Decode(&m); err != nil {
            fmt.Println(err)
            fmt.Println("Bad", m)

            // https://blog.golang.org/error-handling-and-go
            if serr, ok := err.(*json.SyntaxError); ok {
                fmt.Println("Syntax error", serr)
            }

        } else {
            fmt.Println("Good", m)
        }

        tmpl := template.Must(template.New("test").Parse("OUTPUT: {{ if .Title }}{{.Title}}{{ if .PublishedAt }} was published at {{.PublishedAt}} {{ end }}{{end}}\n"))
        tmpl.Execute(os.Stdout, m)
    }

}

What am I missing? Any tools or strategies or suggestions would be much appreciated. My output currently looks like:

Good {foobar 2016-11-24 16:17:12 +0800 SGT}
OUTPUT: foobar was published at 2016-11-24 16:17:12 +0800 SGT
parsing time ""null"" as ""2006-01-02T15:04:05Z07:00"": cannot parse "null"" as "2006"
Bad {barbar 0001-01-01 00:00:00 +0000 UTC}
OUTPUT: barbar was published at 0001-01-01 00:00:00 +0000 UTC
Good { 1999-12-24 16:11:12 +0200 +0200}
OUTPUT:
Good {Something else entirely 2000-01-24 16:11:12 +0200 +0200}
OUTPUT: Something else entirely was published at 2000-01-24 16:11:12 +0200 +0200

But I need something like this in my stderr to better debug the issue:

Line 8: published_at is invalid

And maybe some context of the Title so I can tell the API backend team they have an error in their JSON response.

BONUS question: I also don't want to print the value 0001-01-01 00:00:00 +0000 UTC, since the field is effectively empty. I don't mind it being omitted.

Upvotes: 7

Views: 13586

Answers (4)

Pavel Evstigneev

Reputation: 5126

I found a solution that shows the input around the offset reported by json.SyntaxError:

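if err := json.Unmarshal([]byte(data), &myStruct); err != nil {
    if jsonErr, ok := err.(*json.SyntaxError); ok {
        // clamp the window so it never runs past either end of data
        // (min and max are builtins since Go 1.21)
        start := max(jsonErr.Offset-10, 0)
        end := min(jsonErr.Offset+10, int64(len(data)))
        problemPart := data[start:end]
        err = fmt.Errorf("%w ~ error near '%s' (offset %d)", err, problemPart, jsonErr.Offset)
    }
}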

It will print something like:

invalid character 'n' after object key:value pair ~ error near 'rence\","numberOfBil' (offset 14557)
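Since the question asks for a line number rather than a byte offset, the offset can be converted by counting newlines before it. This is only a minimal sketch: lineCol is a hypothetical helper (it is not part of encoding/json), and the sample input is made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// lineCol converts a byte offset into a 1-based line and column.
// Hypothetical helper for illustration, not part of encoding/json.
func lineCol(data []byte, offset int64) (line, col int) {
	// Clamp the offset so slicing below cannot panic.
	if offset < 0 {
		offset = 0
	}
	if offset > int64(len(data)) {
		offset = int64(len(data))
	}
	head := data[:offset]
	line = bytes.Count(head, []byte("\n")) + 1
	// LastIndexByte returns -1 when there is no newline, which makes
	// the column 1-based on the first line as well.
	col = len(head) - bytes.LastIndexByte(head, '\n')
	return line, col
}

func main() {
	// Made-up input: the second object is missing the ':' after "title".
	data := []byte("[\n  {\"title\": \"foobar\"},\n  {\"title\" \"barbar\"}\n]")

	var movies []map[string]string
	err := json.Unmarshal(data, &movies)

	var serr *json.SyntaxError
	if errors.As(err, &serr) {
		line, col := lineCol(data, serr.Offset)
		fmt.Printf("line %d, column %d: %v\n", line, col, serr)
	}
}
```

This only helps for genuine syntax errors, because json.SyntaxError is the error type that carries an Offset here.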

Upvotes: 5

LeGEC

Reputation: 52151

One way to both accept null values and print nothing when published_at is null is to make the PublishedAt field a pointer:

type Movie struct {
    Title       string    `json:"title"`
    PublishedAt *time.Time `json:"published_at"`
}
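As a rough illustration of the pointer approach, the sketch below decodes a made-up two-element array in which one published_at is a JSON null literal. decodeMovies is a hypothetical wrapper around json.Unmarshal, and the sample data is an assumption. Note that this covers a real JSON null only; a quoted string such as "null" would still fail to parse as a time.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type Movie struct {
	Title       string     `json:"title"`
	PublishedAt *time.Time `json:"published_at"`
}

// decodeMovies is a hypothetical helper that unmarshals a JSON array of movies.
func decodeMovies(data []byte) ([]Movie, error) {
	var movies []Movie
	err := json.Unmarshal(data, &movies)
	return movies, err
}

func main() {
	// Made-up sample data: the second entry carries a JSON null.
	data := []byte(`[
		{"title": "foobar", "published_at": "2016-11-24T16:17:12+08:00"},
		{"title": "barbar", "published_at": null}
	]`)

	movies, err := decodeMovies(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, m := range movies {
		// A JSON null leaves the pointer nil, so there is no zero time to print.
		if m.PublishedAt == nil {
			fmt.Printf("%s has no publication date\n", m.Title)
			continue
		}
		fmt.Printf("%s was published at %s\n", m.Title, m.PublishedAt.Format(time.RFC3339))
	}
}
```

A nil pointer is also falsy in text/template, so a guard like {{ if .PublishedAt }} then skips the date as intended.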

The input string is valid JSON, so the json package does not raise a SyntaxError.

The json package has other error types, such as UnmarshalTypeError, which is returned when a JSON value does not match the built-in Go type it is being decoded into (e.g. string, int, array ...).

Unfortunately, when it calls a custom UnmarshalJSON() function, it looks like the json package returns the raw error :

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "time"
)

// check the full type of an error raised when Unmarshaling a json string
func main() {
    var test struct {
        Clock time.Time
    }
    buf := bytes.NewBufferString(`{"Clock":null}`)
    dec := json.NewDecoder(buf)

    // ask to decode an invalid null value into a flat time.Time field :
    err := dec.Decode(&test)

    // print the details of the returned error :
    fmt.Printf("%#v\n", err)
}

// Output :
&time.ParseError{Layout:"\"2006-01-02T15:04:05Z07:00\"", Value:"null", LayoutElem:"\"", ValueElem:"null", Message:""}

https://play.golang.org/p/fhZxVpOflb

The final error comes straight from the time package. It is not some kind of UnmarshalError from the json package, which could at least tell you "this error occurred when trying to unmarshal the value at this offset", so the error alone does not give you any context.


You can look specifically for type *time.ParseError in the error :

if terr, ok := err.(*time.ParseError); ok {
    // in the example : Movie has one single time.Time field ;
    // if a time.ParseError occurred, it was while trying to read that field
    fmt.Println("Error when trying to read 'published_at' value", terr)

    // you can leave the field to its zero value,
    // or if you switched to a pointer field :
    m.PublishedAt = nil
}

If you happen to have several time fields (e.g. ProducedAt and PublishedAt), you can still check which field was left at its zero value:

if terr, ok := err.(*time.ParseError); ok {
    if m.ProducedAt.IsZero() {
        fmt.Println("Error when trying to read 'produced_at' value", terr)
    }

    if m.PublishedAt.IsZero() {
        fmt.Println("Error when trying to read 'published_at' value", terr)
    }
}

By the way: as specified in the docs, "0001-01-01 00:00:00 UTC" is the zero value the Go team chose for time.Time.

Upvotes: 4

outdead

Reputation: 477

It looks like madness, but it should work:

rawBody := []byte(`{"title":"test", "published_at":"2017-08-05T15:04:05Z", "edited_at":"05.08.2017"}`)

type Movie struct {
   Title       string    `json:"title"`
   PublishedAt time.Time `json:"published_at"`
   EditedAt    time.Time `json:"edited_at"`
}

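var msg Movie

if err = json.Unmarshal(rawBody, &msg); err != nil {
    if _, ok := err.(*time.ParseError); ok {
        // reflect on a pointer so that Elem() yields the struct itself;
        // calling Elem() on a plain struct value panics
        value := reflect.ValueOf(&msg).Elem()

        if value.Kind() != reflect.Struct {
            return err
        }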

        for i := 0; i < value.NumField(); i++ {
            field := value.Field(i)

            if t, ok := field.Interface().(time.Time); ok {
                if t.IsZero() {
                    name := value.Type().Field(i).Name
                    return fmt.Errorf("field: %s, message: %s", strings.ToLower(name), "time is not in RFC 3339 format.")
                }
            }
        }
    }

    return err
}

This code returns only the first error it finds. If PublishedAt is invalid, we learn nothing about EditedAt, even if it is valid.

Upvotes: 0

liwp_Stephen

Reputation: 2870

The published_at value in your data is "null", which is a string, so I think you can declare PublishedAt as a string and parse it into a time.Time in your own code.

This is my test code:

package main

import (
    "encoding/json"

    "github.com/swanwish/go-common/logs"
    "github.com/swanwish/go-common/utils"
)

func main() {
    url := `https://s.natalian.org/2016-12-07/debugme2.json`
    _, content, err := utils.GetUrlContent(url)
    if err != nil {
        logs.Errorf("Failed to get content from url %s, the error is %v", url, err)
        return
    }

    movies := []struct {
        Title       string `json:"title"`
        PublishedAt string `json:"published_at"`
    }{}
    err = json.Unmarshal(content, &movies)
    if err != nil {
        logs.Errorf("Failed to unmarshal content %s, the error is %v", string(content), err)
        return
    }
    logs.Debugf("The movies are %v", movies)
}

The result is:

The movies are [{foobar 2016-11-24T16:17:12.000+08:00} {barbar null} { 1999-12-24T16:11:12.000+02:00} {Something else entirely 2000-01-24T16:11:12.000+02:00}]
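The parsing step is not shown above, so here is a minimal sketch of how it might look using only the standard library. rawMovie and parsePublishedAt are hypothetical names, the sample data is made up to resemble the output above, and treating "null" and the empty string as "missing" is an assumption about what the backend means.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// rawMovie keeps published_at as a string so decoding never fails on it.
type rawMovie struct {
	Title       string `json:"title"`
	PublishedAt string `json:"published_at"`
}

// parsePublishedAt is a hypothetical helper. It returns (nil, nil) when the
// value is empty or the literal string "null", and an error when the value
// is present but not a valid RFC 3339 timestamp.
func parsePublishedAt(s string) (*time.Time, error) {
	if s == "" || s == "null" {
		return nil, nil
	}
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return nil, err
	}
	return &t, nil
}

func main() {
	// Made-up sample data modelled on the output above.
	data := []byte(`[
		{"title": "foobar", "published_at": "2016-11-24T16:17:12.000+08:00"},
		{"title": "barbar", "published_at": "null"}
	]`)

	var movies []rawMovie
	if err := json.Unmarshal(data, &movies); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for i, m := range movies {
		t, err := parsePublishedAt(m.PublishedAt)
		switch {
		case err != nil:
			fmt.Printf("element %d (%q): published_at is invalid: %v\n", i, m.Title, err)
		case t == nil:
			fmt.Printf("element %d (%q): published_at is missing\n", i, m.Title)
		default:
			fmt.Printf("%s was published at %s\n", m.Title, t.Format(time.RFC3339))
		}
	}
}
```

Because each field is validated after decoding, the message can name both the element index and the title, which is close to the per-field report the question asks for.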

Upvotes: 0
