lihaichao

Reputation: 55

Is there an AWS S3 Go API for reading a file instead of downloading it?

Is there an API to read an AWS S3 file in Go? I have only found APIs that download the file to the local machine and then read the downloaded local file, but I need to read the file as a stream (like reading a local file).

I want to be able to read the file in real time: read 100 bytes, do something with those 100 bytes, then read the next chunk, and so on until the end of the file. So far I have only found the Go AWS S3 API that downloads the entire file to the local machine and then handles the downloaded local file.

My current test code is this:

package main

import (
    "bufio"
    "fmt"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func exitErrorf(msg string, args ...interface{}) {
    fmt.Fprintf(os.Stderr, msg+"\n", args...)
    os.Exit(1)
}

func main() {
    bucket := "private bucket"
    item := "private item"

    file, err := os.Create("local path")
    if err != nil {
        exitErrorf("Unable to open file %q, %v", item, err)
    }
    defer file.Close()

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String(" "), // region placeholder
    })
    if err != nil {
        exitErrorf("Unable to create session, %v", err)
    }

    downloader := s3manager.NewDownloader(sess)

    // Download the whole object into the local file
    _, err = downloader.Download(file,
        &s3.GetObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(item),
        })
    if err != nil {
        exitErrorf("Unable to download item %q, %v", item, err)
    }

    // Handle the downloaded file (WriteAt does not move the file offset,
    // so the scanner starts reading from the beginning)
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        // Do something
    }
}

This downloads the file from S3 to the local machine, then opens the downloaded file and handles it byte by byte.

I wonder whether I can directly read each line of the file (or each 100 bytes of the file) from S3.

Upvotes: 2

Views: 4894

Answers (2)

SysCoder

Reputation: 764

Download() takes a WriterAt, but you want a Reader to read from. You can achieve this in four steps:

Create a fake WriterAt to wrap a Writer:

type FakeWriterAt struct {
    w io.Writer
}

// WriteAt ignores the offset and writes sequentially. This is only safe
// because the downloader's concurrency is set to 1, so parts arrive in order.
func (fw FakeWriterAt) WriteAt(p []byte, offset int64) (n int, err error) {
    return fw.w.Write(p)
}

Create an io.Pipe so that whatever is written to its writer can be read back from its reader:

r, w := io.Pipe()

Set concurrency to one so the download will be sequential:

downloader.Concurrency = 1

Wrap the writer created with io.Pipe() with the FakeWriterAt created in the first step. Use the Download function to write to the wrapped Writer:

go func() {
    // Propagate any download error to the reading side;
    // a nil error just closes the pipe so the reader sees io.EOF.
    _, err := downloader.Download(FakeWriterAt{w},
        &s3.GetObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
        })
    w.CloseWithError(err)
}()

You can now use the reader from the io.Pipe to read from S3.
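For example, the reading side could consume the object 100 bytes at a time. This is only a minimal sketch continuing from the r created above; the buffer size and the processing step are placeholders:

buf := make([]byte, 100)
for {
    n, err := io.ReadFull(r, buf)
    if n > 0 {
        // Do something with buf[:n]
    }
    if err != nil {
        // io.EOF (or io.ErrUnexpectedEOF on the last partial chunk) ends the loop;
        // a download error passed to CloseWithError shows up here as well.
        break
    }
}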

The minimum part size is 5 MB according to the documentation.
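If you want to tune that, the downloader exposes the part size as a plain field. This is just a sketch; 5 MB is already the SDK's default:

// Optional: per-part buffer size used by the downloader (defaults to 5 MB).
downloader.PartSize = 5 * 1024 * 1024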

Reference: https://dev.to/flowup/using-io-reader-io-writer-in-go-to-stream-data-3i7b

Upvotes: 3

tacobot

Reputation: 935

As far as I understand, you probably need a Range request to get the file chunk by chunk.
Here is some pseudo-code:

// Set up the request for the object
input := &s3.GetObjectInput{
    Bucket: aws.String(BucketName),
    Key:    aws.String(Path),
}

// Ask for a specific byte range; both ends are inclusive,
// e.g. "bytes=0-99" fetches the first 100 bytes
input.Range = aws.String(fmt.Sprintf("bytes=%d-%d", Position, Offset))

// Get that particular chunk of the object (svc is an *s3.S3 client)
result, err := svc.GetObject(input)
if err != nil {
    return nil, err
}
defer result.Body.Close()

// Read the chunk
b, err := ioutil.ReadAll(result.Body)

Or, if for some reason you need the whole file at once (I can't recommend it), just omit Range and that's it.
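Put together, a loop that walks the object in fixed-size chunks could look roughly like this. It is only a sketch: svc is assumed to be an *s3.S3 client (e.g. s3.New(sess)), BucketName and Path are as above, and the end-of-object handling is simplified:

// Read an S3 object 100 bytes at a time using Range requests.
const chunkSize = 100
var pos int64

for {
    out, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(BucketName),
        Key:    aws.String(Path),
        Range:  aws.String(fmt.Sprintf("bytes=%d-%d", pos, pos+chunkSize-1)),
    })
    if err != nil {
        // A range starting past the end of the object returns an InvalidRange error,
        // which is treated here as "done".
        break
    }

    b, err := ioutil.ReadAll(out.Body)
    out.Body.Close()
    if err != nil {
        break
    }

    // Do something with the chunk in b

    if int64(len(b)) < chunkSize {
        break // last, partial chunk
    }
    pos += chunkSize
}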

Upvotes: 1
