Reputation: 886
I have a scenario where I need to range over (as many as possible) map entries and send them into a channel. The operation at the other end of the channel can take a long amount of time and the map is accessed concurrently (and protected by an RWMutex
). The map is also rather big and I want to avoid creating a temporary copy of it.
Assume I have a struct like this:
type Example struct {
sync.RWMutex
m map[string]struct{}
}
Now I came up with something like this:
func (e *Example) StreamAll() <-chan string {
toReturn := make(chan string)
go func() {
e.RLock()
defer e.RUnlock()
for k := range e.m {
e.RUnlock()
toReturn <- k
e.RLock()
}
close(toReturn)
}()
return toReturn
}
The language specification has this interesting bit about ranging over maps:
If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped.
Now, what I'd like to know is this: Is there a guarantee that my method of ranging over the map works even if the map is changed between iterations? Including the case where the key I last read is deleted? I don't need all of the map entries, but most of them.
Here is a complete example:
package main
import (
"fmt"
"sync"
)
type Example struct {
sync.RWMutex
m map[string]struct{}
}
func NewExample() *Example {
return &Example{
m: make(map[string]struct{}),
}
}
func (e *Example) Put(s string) {
e.Lock()
defer e.Unlock()
e.m[s] = struct{}{}
}
func (e *Example) Delete(s string) {
e.Lock()
defer e.Unlock()
delete(e.m, s)
}
func (e *Example) StreamAll() <-chan string {
toReturn := make(chan string)
go func() {
e.RLock()
defer e.RUnlock()
for k := range e.m {
e.RUnlock()
toReturn <- k
e.RLock()
}
close(toReturn)
}()
return toReturn
}
func main() {
e := NewExample()
e.Put("a")
e.Put("b")
values := e.StreamAll()
// Assume other goroutines concurrently call Put and Delete on e
for k := range values {
fmt.Println(k)
}
}
Upvotes: 0
Views: 1476
Reputation: 13065
I see 3 choices:
0) Inconsistent Snapshot This is what you have: Your map is changing as you generate keys, so you get what you get. I'm not entirely sure your locking is correct. It looks really suspicious to me. Test it extensively with the race detector, of course.
1) "Stop The World" - You can block write access to your map while you generate all the keys. Generating keys is much quicker than processing the items, and will give you a perfect consistent snapshot of the items to process. Unfortunately, when you send the keys to co-routines, those keys may not exist by the time you get around to processing it. It sounds like you are OK with that. It does require storing a copy of all the keys, so hopefully this is OK.
2) Roll your own MVCC (Multi-Version Concurrency Control) - Instead of using 1 map, use 2. Let's call them A and B. The idea is to only write to the first map while you process the second map, then flip roles.
When it's time to start your background jobs, just take out a RW lock, while you flip the boolean. Now you can iterate over the keys of the "fall back" map, calling a gouroutine for each one. The key is guaranteed to exist, since nobody is writing to that map.
You can delete from B in the goroutine (requires using locking when reading and again while deleting).
But it might be better/simpler to just process all the entries with NO locking (since everything is just reading), then wait for all the goroutines to be done, then wipe out B by doing "B = make()" to get an empty map. This will free up all your memory at once, and save some accounting that needs to be done after ever delete.
Once you wipe the map (or have deleted all the entries), you can take the RW lock while you flip the boolean the other way, then start processing the other map.
The downside is that you will end up with 2 copies of the map if you update items frequently. If this is the case: 1) have WRITES check the fallback map. If it's there, update it, otherwise update the main map. 2) Delete the item from the map BEFORE background processing. (You can't use the bulk delete trick.)
Upvotes: 1