Reputation: 13475
I would like to write a data.frame / data.table object directly to a bucket in AWS S3 as a CSV file, using the AWS CLI, without writing it to disk first.
obj.to.write.s3 <- data.frame(cbind(x1=rnorm(1e6),x2=rnorm(1e6,5,10),x3=rnorm(1e6,20,1)))
At the moment I write to a CSV file first, upload it to an existing bucket, and then remove the file:
fn <- 'new-file-name.csv'
write.csv(obj.to.write.s3,file=fn)
system(paste0('aws s3 cp ',fn,' s3://my-bucket-name/',fn))
system(paste0('rm ',fn))
I would like a function that writes directly to S3. Is that possible?
Upvotes: 22
Views: 20293
Reputation: 14958
In aws.s3 0.2.2 the s3write_using() (and s3read_using()) functions were added.
They make things much simpler:
s3write_using(iris, FUN = write.csv,
bucket = "bucketname",
object = "objectname")
Upvotes: 32
Reputation: 44525
The easiest solution is just to save the .csv in a tempfile(), which will be purged automatically when you close your R session.
If you need to work only in memory, you can call write.csv() on a rawConnection:
# write to an in-memory raw connection
zz <- rawConnection(raw(0), "r+")
write.csv(iris, zz)
# upload the object to S3
aws.s3::put_object(file = rawConnectionValue(zz),
bucket = "bucketname", object = "iris.csv")
# close the connection
close(zz)
In case you're unsure, you can then check that this worked correctly by downloading the object from S3 and reading it back into R:
# check that it worked
## (option 1: save locally)
save_object(object = "iris.csv", bucket = "bucketname", file = "iris.csv")
read.csv("iris.csv")
## (option 2: keep in memory)
read.csv(text = rawToChar(get_object(object = "iris.csv", bucket = "bucketname")))
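The in-memory steps above could be wrapped in a small helper so the connection is always closed, even if write.csv() fails. This is only a sketch: the name csv_as_raw is hypothetical and not part of aws.s3, and the upload itself is left as a comment because it needs real credentials and a real bucket.

```r
# Hypothetical helper (not part of aws.s3): serialise a data frame to CSV
# entirely in memory and return the bytes as a raw vector.
csv_as_raw <- function(df, ...) {
  con <- rawConnection(raw(0), open = "r+")
  on.exit(close(con), add = TRUE)   # close the connection even on error
  write.csv(df, con, ...)
  rawConnectionValue(con)
}

payload <- csv_as_raw(iris, row.names = FALSE)

# Local round trip: parse the bytes back without touching disk or S3.
round_trip <- read.csv(text = rawToChar(payload))

# The raw vector can then be passed to put_object() as in the answer above:
# aws.s3::put_object(file = payload, bucket = "bucketname", object = "iris.csv")
```

Checking the round trip locally first makes it easier to separate serialisation problems from S3 permission or network problems.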
Upvotes: 8
Reputation: 368241
Sure -- but 'saving to file' requires that your OS sees the desired target directory as an accessible filesystem. So in essence you "just" need to mount S3. Here is a quick Google search for that topic.
An alternative is writing to a temporary file, and then using whatever you use to transfer files. You could code up both operations as a simple helper function.
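A minimal sketch of such a helper follows, assuming the AWS CLI is installed and configured. The function names write_csv_to_s3 and aws_cli_upload are made up for illustration, and the upload step is passed in as an argument so it can be swapped for any other transfer tool.

```r
# Hypothetical uploader: copies a local file to S3 with the AWS CLI.
# Assumes `aws` is on the PATH and credentials are already configured.
aws_cli_upload <- function(path, bucket, key) {
  dest <- paste0("s3://", bucket, "/", key)
  status <- system2("aws", c("s3", "cp", shQuote(path), shQuote(dest)))
  if (status != 0) stop("aws s3 cp failed with exit status ", status)
  invisible(dest)
}

# Hypothetical helper: write the data frame to a temporary CSV file,
# hand it to the uploader, and delete the file afterwards.
write_csv_to_s3 <- function(df, bucket, key, upload = aws_cli_upload) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp), add = TRUE)  # remove the temp file even on error
  write.csv(df, file = tmp, row.names = FALSE)
  upload(tmp, bucket, key)
}

# Example call (needs a real bucket, so it is not run here):
# write_csv_to_s3(obj.to.write.s3, "my-bucket-name", "new-file-name.csv")
```

The data still touches disk briefly, but the caller never has to manage the file, and on.exit() cleans it up whether or not the upload succeeds.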
Upvotes: 0