Reputation: 516
I need to get the number of JSON objects in a given file. The file contains an array of JSON objects. It takes approximately 150-180 seconds to count a file with 1 million objects. Is there a way to optimize the code below to get the count faster?
func Count(file string) (int, error) {
    f, err := os.Open(file)
    if err != nil {
        return -1, err
    }
    defer f.Close()
    dec := json.NewDecoder(bufio.NewReader(f))
    // Consume the opening '[' of the array.
    _, e := dec.Token()
    if e != nil {
        return -1, e
    }
    var count int
    // Decode each array element into a map and count it.
    for dec.More() {
        var tempMap map[string]interface{}
        readErr := dec.Decode(&tempMap)
        if readErr != nil {
            return -1, readErr
        }
        count++
    }
    return count, nil
}
Upvotes: 0
Views: 279
Speed things up by counting start object delimiters instead of decoding to Go values.
Based on the code in the question, it looks like your goal is to count objects at the first level of nesting in the document. Here's code that does that:
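func Count(r io.Reader) (int, error) {
    dec := json.NewDecoder(r)
    nest := 0 // current nesting depth, counting both arrays and objects
    count := 0
    for {
        t, err := dec.Token()
        if err == io.EOF {
            break
        }
        if err != nil {
            return -1, err
        }
        switch t {
        case json.Delim('{'):
            // An object directly inside the top-level array sits at depth 1.
            if nest == 1 {
                count++
            }
            nest++
        case json.Delim('['):
            nest++
        case json.Delim('}'), json.Delim(']'):
            nest--
        }
    }
    return count, nil
}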
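To try the first-level count against a real file, here is a minimal, self-contained sketch. It assumes the file holds a single top-level array, and it tracks array delimiters as well as object delimiters so that the array's elements sit at depth 1. The names countTopLevel, countFile, countString and the variable depth are hypothetical, chosen only for this illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
    "io"
    "os"
    "strings"
)

// countTopLevel counts objects that are direct elements of the outermost array.
func countTopLevel(r io.Reader) (int, error) {
    dec := json.NewDecoder(r)
    depth, count := 0, 0
    for {
        t, err := dec.Token()
        if err == io.EOF {
            return count, nil
        }
        if err != nil {
            return -1, err
        }
        switch t {
        case json.Delim('{'):
            if depth == 1 { // directly inside the top-level array
                count++
            }
            depth++
        case json.Delim('['):
            depth++
        case json.Delim('}'), json.Delim(']'):
            depth--
        }
    }
}

// countFile (hypothetical helper) opens path and counts its top-level objects.
func countFile(path string) (int, error) {
    f, err := os.Open(path)
    if err != nil {
        return -1, err
    }
    defer f.Close()
    return countTopLevel(f)
}

// countString (hypothetical helper) runs the same count on an in-memory document.
func countString(doc string) (int, error) {
    return countTopLevel(strings.NewReader(doc))
}

func main() {
    if len(os.Args) > 1 {
        // Count the file named on the command line.
        fmt.Println(countFile(os.Args[1]))
        return
    }
    // Nested objects are skipped; only the three array elements are counted.
    fmt.Println(countString(`[{"a": {"b": 1}}, {"c": [{"d": 2}]}, {}]`)) // prints: 3 <nil>
}
```

The decoder is handed the file directly because json.Decoder does its own buffering, so an extra bufio.Reader should not be needed.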
If your goal is to count all objects, remove all uses of nest from the code above:
func Count(r io.Reader) (int, error) {
    dec := json.NewDecoder(r)
    count := 0
    for {
        t, err := dec.Token()
        if err == io.EOF {
            break
        }
        if err != nil {
            return -1, err
        }
        // Every '{' token starts an object, at any nesting depth.
        switch t {
        case json.Delim('{'):
            count++
        }
    }
    return count, nil
}
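Whether token counting actually beats decoding into maps on your data is easiest to settle by measuring. The sketch below times both approaches on a synthetic in-memory array. The helper names (makeDoc, countByDecode, countByToken), the object shape and the element count are all assumptions made for illustration, and timings will vary with machine, Go version and document shape.

```go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "time"
)

// makeDoc builds a synthetic JSON array of n small, flat objects (made-up test data).
func makeDoc(n int) []byte {
    var b bytes.Buffer
    b.WriteByte('[')
    for i := 0; i < n; i++ {
        if i > 0 {
            b.WriteByte(',')
        }
        fmt.Fprintf(&b, `{"id":%d,"name":"item","tags":["a","b"]}`, i)
    }
    b.WriteByte(']')
    return b.Bytes()
}

// countByDecode mirrors the question: decode every array element into a map.
func countByDecode(doc []byte) (int, error) {
    dec := json.NewDecoder(bytes.NewReader(doc))
    if _, err := dec.Token(); err != nil { // consume the opening '['
        return -1, err
    }
    count := 0
    for dec.More() {
        var m map[string]interface{}
        if err := dec.Decode(&m); err != nil {
            return -1, err
        }
        count++
    }
    return count, nil
}

// countByToken mirrors the answer: count '{' tokens without building Go values.
func countByToken(doc []byte) (int, error) {
    dec := json.NewDecoder(bytes.NewReader(doc))
    count := 0
    for {
        t, err := dec.Token()
        if err == io.EOF {
            return count, nil
        }
        if err != nil {
            return -1, err
        }
        if t == json.Delim('{') {
            count++
        }
    }
}

func main() {
    doc := makeDoc(100000) // assumed size; raise it to approach the 1 million case
    funcs := []struct {
        name string
        fn   func([]byte) (int, error)
    }{
        {"decode", countByDecode},
        {"token", countByToken},
    }
    for _, c := range funcs {
        start := time.Now()
        n, err := c.fn(doc)
        fmt.Printf("%-6s count=%d err=%v elapsed=%v\n", c.name, n, err, time.Since(start))
    }
}
```

Both functions return the same count here only because the synthetic objects are flat. With nested objects, countByToken counts every object while countByDecode counts only the array elements.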
Upvotes: 1