Reputation: 55
Is there an API to read an AWS S3 file in Go as a stream (like reading a local file)? I can only find APIs that download the file to the local machine and then read the downloaded local file.
I want to process the file as it arrives: read 100 bytes, do something with those 100 bytes, then read the rest of the file the same way. The only Go AWS S3 API I can find downloads the entire file to the local machine and then handles the downloaded local file.
My current test code is this:
package main

import (
    "bufio"
    "io"
    "log"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    bucket := "private bucket"
    item := "private item"
    file, err := os.Create("local path")
    if err != nil {
        log.Fatalf("Unable to open file %q, %v", item, err)
    }
    defer file.Close()
    sess, err := session.NewSession(&aws.Config{
        Region: aws.String(" "),
    })
    if err != nil {
        log.Fatalf("Unable to create session, %v", err)
    }
    downloader := s3manager.NewDownloader(sess)
    _, err = downloader.Download(file,
        &s3.GetObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(item),
        })
    if err != nil {
        log.Fatalf("Unable to download item %q, %v", item, err)
    }
    // Rewind, then handle the downloaded file
    file.Seek(0, io.SeekStart)
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        // Do something
    }
}
This downloads the file from S3 to the local machine, then opens the downloaded file and handles each byte.
I wonder: can I read each line of the file (or each 100 bytes of the file) directly from S3?
Upvotes: 2
Views: 4894
Reputation: 764
Download() takes an io.WriterAt, but you want an io.Reader to read from. You can achieve this in four steps:
Create a fake WriterAt to wrap a Writer:
type FakeWriterAt struct {
    w io.Writer
}

// WriteAt ignores the offset and writes sequentially. This is only safe
// when parts arrive in order, i.e. with Concurrency set to 1 (see below).
func (fw FakeWriterAt) WriteAt(p []byte, offset int64) (n int, err error) {
    return fw.w.Write(p)
}
Create an io.Pipe so you can read whatever is written to its writer:
r, w := io.Pipe()
Set concurrency to one so the download will be sequential:
downloader.Concurrency = 1
Wrap the writer created with io.Pipe() with the FakeWriterAt created in the first step. Use the Download function to write to the wrapped Writer:
go func() {
    _, err := downloader.Download(FakeWriterAt{w},
        &s3.GetObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
        })
    // Close the pipe and propagate any download error to the reader;
    // with a nil error this behaves like w.Close().
    w.CloseWithError(err)
}()
You can now use the reader from the io.Pipe to read from S3.
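For example, here is a minimal sketch of reading 100 bytes at a time from the pipe's read end, assuming the usual io and log imports; process is a hypothetical placeholder for whatever you do with each chunk:
buf := make([]byte, 100)
for {
    n, err := io.ReadFull(r, buf)
    if n > 0 {
        process(buf[:n]) // hypothetical handler for each chunk
    }
    if err == io.EOF || err == io.ErrUnexpectedEOF {
        break // end of the object
    }
    if err != nil {
        log.Fatal(err) // a real program would handle this more gracefully
    }
}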
The minimum part size is 5 MB according to the documentation, so the reader will receive data in parts of up to 5 MB.
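As a sketch, both settings can also be supplied up front through the functional options of s3manager.NewDownloader instead of mutating the downloader afterwards:
downloader := s3manager.NewDownloader(sess, func(d *s3manager.Downloader) {
    d.Concurrency = 1            // sequential parts, required for the pipe approach
    d.PartSize = 5 * 1024 * 1024 // 5 MB per part
})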
Reference: https://dev.to/flowup/using-io-reader-io-writer-in-go-to-stream-data-3i7b
Upvotes: 3
Reputation: 935
As far as I understand, you probably need a Range request to get the file chunk by chunk.
Here is some pseudo-code:
// Set up the input
input := &s3.GetObjectInput{
    Bucket: aws.String(BucketName),
    Key:    aws.String(Path),
}

// Calculate the position; the HTTP Range header is inclusive on both
// ends, so this requests bytes Position through End.
input.Range = aws.String(fmt.Sprintf("bytes=%d-%d", Position, End))

// Get a particular chunk of the object
result, err := o.Service().GetObject(input)
if err != nil {
    return nil, err
}
defer result.Body.Close()

// Read the chunk
b, err := ioutil.ReadAll(result.Body)
Or, if for some reason you need the whole file at once (I can't recommend it), just omit Range
and that's it.
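To make the chunk-by-chunk idea concrete, here is a minimal sketch of a loop over consecutive 100-byte ranges, assuming the usual aws, awserr, s3, fmt, and ioutil imports; readInChunks, the process callback, and the InvalidRange handling are my assumptions, not part of the pseudo-code above:
// readInChunks streams an S3 object by issuing consecutive Range requests.
func readInChunks(svc *s3.S3, bucket, key string, process func([]byte)) error {
    const chunkSize = 100
    for pos := int64(0); ; pos += chunkSize {
        input := &s3.GetObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
            // Inclusive range: bytes pos through pos+chunkSize-1
            Range: aws.String(fmt.Sprintf("bytes=%d-%d", pos, pos+chunkSize-1)),
        }
        result, err := svc.GetObject(input)
        if err != nil {
            // S3 answers 416 InvalidRange once pos is past the end of the object.
            if aerr, ok := err.(awserr.Error); ok && aerr.Code() == "InvalidRange" {
                return nil
            }
            return err
        }
        b, err := ioutil.ReadAll(result.Body)
        result.Body.Close()
        if err != nil {
            return err
        }
        process(b) // hypothetical per-chunk handler
        if int64(len(b)) < chunkSize {
            return nil // short chunk: reached the end of the object
        }
    }
}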
Upvotes: 1