Reputation: 107
In our Go code using gocb, we query a view that returns 32k IDs. We then perform a bulk get (see code below), as explained in a Couchbase blog post. However, we only get partial results. We can see that ruleset, _ := items[i].(*gocb.GetOp).Value.(*RuleSet) only returns a value for the first 2048 IDs. The IDs 2049 - 11322 then do not contain a value, and so on. Our result looks like this:
Line 1 Key: 12345678901234567890123456789012, Value: map[0.0.0.0/0:map[jsona:valueofjsona]]
...
Line 2018 Key: 12345678901234567890123456712345, Value: map[0.0.0.0/0:map[jsona:valueofjsona]]
Line 2019 Key: 12345678901234567890123456712345, Value: map[]
...
Line 11323 Key: 12345678901234567890123456712347, Value: map[jsonb:valueofjsonb]]
(The above lines are simplified, the keys don't match actual data, nor does the value.)
A huge portion of the requested data is not actually returned:
CB# grep '\[\]' result.out |wc -l
27042
CB# wc -l result.out
31988 rdmp.out
Does bucket.Do return before it has finished processing all queries? We looked at the API code and could not find an explanation.
Any idea how to solve this?
type RuleSet struct {
    Rules map[string]interface{} `json:"rules"`
}
func DiffViaBulkQuery() {
    var itemsGet []gocb.BulkOp
    var row interface{}
    var cnt int = 0
    bucket := cbase.MyBucket()
    // [...]
    // add 600k entries to itemsGet in a loop like
    // itemsGet = append(itemsGet, &gocb.GetOp{Key: key + "_" + strconv.Itoa(i), Value: &Doc{}})
    // Perform the bulk operation to get all documents
    err := bucket.Do(itemsGet)
    if err != nil {
        fmt.Println("ERROR PERFORMING BULK GET:", err)
    }
    // Print the output
    for i := 0; i < len(itemsGet); i++ {
        fmt.Println(itemsGet[i].(*gocb.GetOp).Key, itemsGet[i].(*gocb.GetOp).Value.(*Doc).Item)
    }
}
Thx in advance, Torsten
Upvotes: 1
Views: 294
Reputation: 829
It's worth checking the error value for each of the operations you are performing. You can do this by inspecting op.Err. For example:
for i := 0; i < len(items); i++ {
    fmt.Println(items[i].(*gocb.GetOp).Key, items[i].(*gocb.GetOp).Value.(*Doc).Item, items[i].(*gocb.GetOp).Err)
}
I expect you'll see that you're hitting "queue overflowed" errors, which happen when the gocb dispatcher queue becomes full; it defaults to a maximum size of 2048 items. The solution is usually to perform the work in smaller batches so as not to overload gocb. There is a similar issue with an example at https://forums.couchbase.com/t/bulk-upsert-data-into-couchbase/17354/2
Upvotes: 3