Reputation: 3886
I want to transfer a file from HDFS to S3 in Java. Some files may be huge, so I don't want to download them locally before uploading to S3. Is there any way to do that in Java?
Here's what I have right now (a piece of code that uploads a local file to S3). I can't really use this, because the File object requires the file to exist on my local disk.
File f = new File("/home/myuser/test");
TransferManager transferManager = new TransferManager(credentials);
MultipleFileUpload upload = transferManager.uploadDirectory("mybucket","test_folder",f,true);
Thanks
Upvotes: 4
Views: 4259
Reputation: 3886
I figured out the uploading part.
AWSCredentials credentials = new BasicAWSCredentials(
        "whatever",
        "whatever");
TransferManager transferManager = new TransferManager(credentials);
//+upload from HDFS to S3
Configuration conf = new Configuration();
// load the Hadoop config files
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
Path path = new Path("hdfs://my_ip_address/user/ubuntu/test/test.txt");
FileSystem fs = path.getFileSystem(conf);
FSDataInputStream inputStream = fs.open(path);
ObjectMetadata objectMetadata = new ObjectMetadata();
// set the content length up front; otherwise the SDK buffers the whole
// stream in memory to determine it, which defeats streaming huge files
objectMetadata.setContentLength(fs.getFileStatus(path).getLen());
Upload upload = transferManager.upload("xpatterns-deployment-ubuntu", "test_cu_jmen3", inputStream, objectMetadata);
//-upload from HDFS to S3
try {
    upload.waitForCompletion();
} catch (InterruptedException e) {
    e.printStackTrace();
}
Any ideas about how to do something similar for downloading? I haven't found any download() method in TransferManager that accepts a stream the way upload() does in the code above.
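One workaround, since TransferManager's download() methods in SDK v1 only write to a File, is to drop to the low-level AmazonS3 client: getObject() returns an S3Object whose getObjectContent() is a plain InputStream you can copy straight into an HDFS output stream. A minimal sketch (the bucket, key, and HDFS path are placeholders taken from the snippets above; the S3/HDFS wiring is shown as a comment because it needs live services, while the generic copy loop works on any streams):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class S3ToHdfs {

    // Generic stream copy; Hadoop's org.apache.hadoop.io.IOUtils.copyBytes
    // would do the same job if Hadoop is already on the classpath.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /* The S3 -> HDFS wiring would then look like this (sketch, not runnable
       without AWS credentials and a Hadoop cluster):

       AmazonS3 s3 = new AmazonS3Client(credentials);
       S3Object object = s3.getObject("xpatterns-deployment-ubuntu",
                                      "test_cu_jmen3/test.txt");
       Path dst = new Path("hdfs://my_ip_address/user/ubuntu/test/test.txt");
       try (InputStream in = object.getObjectContent();
            OutputStream out = dst.getFileSystem(conf).create(dst)) {
           copy(in, out);   // streams S3 -> HDFS, nothing touches local disk
       }
    */
}
```

The copy happens in fixed-size chunks, so memory use stays constant regardless of object size, mirroring the streaming upload above.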
Upvotes: 3