user3803555

Reputation: 81

Accessing an S3 bucket from R

I have set up R on an EC2 instance on AWS and uploaded a few CSV files to an S3 bucket. Is there a way to access the CSV files in the S3 bucket from R?

Any help or pointers would be appreciated.

Upvotes: 8

Views: 8217

Answers (3)

John Sandall

Reputation: 449

Have a look at the cloudyr aws.s3 package (https://github.com/cloudyr/aws.s3); it might do what you need. Unfortunately, at the time of writing, this package is at quite an early stage and a little unstable.

I've had good success simply using R's system() function to call the AWS CLI. This is relatively easy to get started with, very robust, and very well supported.

  1. Start here: http://aws.amazon.com/cli/
  2. List objects using S3 API: http://docs.aws.amazon.com/cli/latest/reference/s3api/list-objects.html
  3. Get objects using S3 API: http://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html

For example, try the following on the command line:

pip install awscli
aws configure
aws s3 help
aws s3api list-objects --bucket some-bucket --query 'Contents[].{Key: Key}'
aws s3api get-object --bucket some-bucket --key some_file.csv new_file_name.csv

In R, you can then do something like:

system("aws s3api list-objects --bucket some-bucket --query 'Contents[].{Key: Key}' > my_bucket.json")
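The same approach can be extended to pull a CSV straight into a data frame. The following is only a minimal sketch: it assumes the AWS CLI is installed and already configured via aws configure, and the helper names s3_cp_args and read_s3_csv are hypothetical, made up for this illustration. It uses system2() rather than system() so that the arguments can be passed as a vector and the exit status checked.

```r
# Build the argument vector for `aws s3 cp` (hypothetical helper).
# Kept separate so the command can be inspected without running the CLI.
s3_cp_args <- function(bucket, key, dest) {
  c("s3", "cp", sprintf("s3://%s/%s", bucket, key), dest)
}

# Download one object to a temporary file with the AWS CLI,
# then read it as a CSV (hypothetical helper).
# Extra arguments in `...` are passed on to read.csv().
read_s3_csv <- function(bucket, key, ...) {
  dest <- tempfile(fileext = ".csv")
  on.exit(unlink(dest), add = TRUE)  # remove the temporary copy afterwards

  # shQuote() protects bucket names, keys, and paths containing spaces
  status <- system2("aws", shQuote(s3_cp_args(bucket, key, dest)))
  if (status != 0) {
    stop("aws s3 cp failed with exit status ", status)
  }

  read.csv(dest, ...)
}
```

With the placeholder names used above, a call would look like df <- read_s3_csv("some-bucket", "some_file.csv"). The object is downloaded in full before it is parsed, so this suits files that fit comfortably on the instance's disk.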

Upvotes: 8

shizik

Reputation: 17

Install the libdigest-hmac-perl package:

sudo apt-get install libdigest-hmac-perl

Upvotes: -2

jppodo

Reputation: 19

Install the AWS.tools package by entering the following command: install.packages("AWS.tools")

From there, use the s3.get() function. The Help tab should tell you which arguments it takes.

Upvotes: 0
