rosed

Reputation: 157

Use a Go worker pool implementation to write files in parallel?

I have a slice clientFiles that I iterate over sequentially, writing each file to S3 one by one, as shown below:

for _, v := range clientFiles {
  err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
  if err != nil {
    fmt.Println(err)
  }
}

The above code works fine, but I want to write to S3 in parallel so that I can speed things up. Would a worker pool implementation work better here, or is there another, better option? I found the code below, which uses a WaitGroup, but I am not sure whether it is the better approach:

wg := sync.WaitGroup{}
for _, v := range clientFiles {
  wg.Add(1)
  go func(v ClientMapFile) {
    err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
    if err != nil {
      fmt.Println(err)
    }      
  }(v)
}

Upvotes: 0

Views: 682

Answers (1)

izbudki

Reputation: 46

Yes, parallelising should help.

Your code should work well after a couple of changes to how you use the WaitGroup: each goroutine needs to mark its work as Done, and you need to Wait for all goroutines to finish after the for-loop.

var wg sync.WaitGroup
for _, v := range clientFiles {
  wg.Add(1)
  go func(v ClientMapFile) {
    defer wg.Done()
    err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
    if err != nil {
      fmt.Println(err)
    }      
  }(v)
}
wg.Wait()

Be aware that your solution creates N goroutines for N files, which may not be optimal if the number of files is very large. In that case, use the worker-pool pattern (https://gobyexample.com/worker-pools) and try different numbers of workers to find what performs best for you.
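If you do go that route, a minimal sketch along the lines of that pattern could look like the following. It reuses writeToS3, s3Connection, bucketName, clientFiles and ClientMapFile from your snippet; numWorkers is an assumed tunable you would benchmark for your workload.

// Assumed tunable: number of concurrent uploads.
const numWorkers = 8

jobs := make(chan ClientMapFile)

var wg sync.WaitGroup
for i := 0; i < numWorkers; i++ {
  wg.Add(1)
  go func() {
    defer wg.Done()
    // Each worker drains files from the jobs channel until it is closed.
    for v := range jobs {
      if err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName); err != nil {
        fmt.Println(err)
      }
    }
  }()
}

// Feed all files to the pool, then close the channel so the workers exit.
for _, v := range clientFiles {
  jobs <- v
}
close(jobs)

wg.Wait()

This bounds concurrency to numWorkers regardless of how many files there are, instead of spawning one goroutine per file.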

Upvotes: 2
