Reputation: 7219
I am reading in a .json
file. It's an array of objects in valid JSON format, example:
[
{
"Id": 13,
"Location": "Australia",
"Content": "Another string"
},
{
"Id": 145,
"Location": "England",
"Content": "SomeString"
},
{
"Id": 12,
"Location": "England",
"Content": "SomeString"
},
{
"Id": 12331,
"Location": "Sweden",
"Content": "SomeString"
},
{
"Id": 213123,
"Location": "England",
"Content": "SomeString"
}
]
I want to filter these objects out - say, removing anything where "Location"
doesn't equal "England"
.
What I've tried so far is creating a custom UnmarshalJSON
function. It does unmarshal it, but the objects it produces are empty - and as many as the input.
Sample code:
type languageStruct struct {
ID int `json:"Id"`
Location string `json:"Location"`
Content string `json:"Content"`
}
func filterJSON(file []byte) ([]byte, error) {
var x []*languageStruct
err := json.Unmarshal(file, &x)
check(err)
return json.MarshalIndent(x, "", " ")
}
func (s *languageStruct) UnmarshalJSON(p []byte) error {
var result struct {
ID int `json:"Id"`
Location string `json:"Location"`
Content string `json:"Content"`
}
err := json.Unmarshal(p, &result)
check(err)
// slice of locations we'd like to filter the objects on
locations := []string{"England"} // Can be more
if sliceContains(s.Location, locations) {
s.ID = result.ID
s.Location= result.Location
s.Content = result.Content
}
return nil
}
// helper func to check if a given string, f.e. a value of a key-value pair in a json object, is in a provided list
func sliceContains(a string, list []string) bool {
for _, b := range list {
if b == a {
fmt.Println("it's a match!")
return true
}
}
return false
}
While this runs - the output is wrong. It creates as many objects as comes in - however, the new ones are empty, f.e.:
// ...
[
{
"Id": 0,
"Location": "",
"Content": ""
},
{
"Id": 0,
"Location": "",
"Content": ""
}
]
//...
Whereas my desired output, from the first given input, would be:
[
{
"Id": 145,
"Location": "England",
"Content": "SomeString"
},
{
"Id": 12,
"Location": "England",
"Content": "SomeString"
},
{
"Id": 213123,
"Location": "England",
"Content": "SomeString"
}
]
Upvotes: 4
Views: 6659
Reputation: 418107
When languageStruct.UnmarshalJSON()
is called, there is already a languageStruct
prepared that will be appended to the slice, no matter if you fill its content (fields) or not.
The easiest and my suggested solution is to just unmarshal normally, and post-process the slice: remove elements according to your requirements. This results in clean code, which you can easily adjust / alter in the future. Although it could be implemented as custom marshaling logic on a custom slice type []languageStruct
, I would still not create custom marshaling logic for this but implement it as a separate filtering logic.
Here's a simple code unmarshaling, filtering and marshaling it again (note: no custom marshaling is defined / used for this):
var x []*languageStruct
err := json.Unmarshal(file, &x)
if err != nil {
panic(err)
}
var x2 []*languageStruct
for _, v := range x {
if v.Location == "England" {
x2 = append(x2, v)
}
}
data, err := json.MarshalIndent(x2, "", " ")
fmt.Println(string(data), err)
This will result in your desired output. Try it on the Go Playground.
The fastest and most complex solution would be to use event-driven parsing and building a state machine, but the complexity would increase by large. The idea would be to process the JSON by tokens, track where you're at currently in the object tree, and when an object is detected that must be excluded, don't process / add it to your slice. For details and ideas how this can be written, check out this anwser: Go - Decode JSON as it is still streaming in via net/http
Upvotes: 5