Reputation: 4180
I currently have an S3 bucket directory key like this:
String dir = "s3://mybucket/workflow/science/sweet-humoor/vars";
What I am trying to do is get the prefix of this S3 directory. The prefix is the key without s3://mybucket/, so what I want to have is workflow/science/sweet-humoor/vars
Now, what would be an elegant way to achieve this? I know the quickest way is to call substring(14), but this will break whenever the bucket name changes.
How would you handle this?
Upvotes: 0
Views: 141
Reputation: 22997
The URIBuilder class from the org.apache.http.client.utils package can do that.
URIBuilder builder = new URIBuilder(dir);
String thePath = builder.getPath();
This automatically extracts /workflow/science/sweet-humoor/vars (note the leading slash) as the path. The retrieved path does not include mybucket, because URIBuilder sees the first part immediately after the protocol specifier (s3://) as the hostname.
Further processing can be done through Path p = Paths.get(thePath).
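For projects that do not already depend on Apache HttpClient, the same split between bucket and path can be sketched with the JDK's own java.net.URI. This is a minimal sketch rather than part of the answer above: the class and method names (S3UriPrefix, prefixOf) are made up for illustration, and it assumes the input is a syntactically valid URI.

```java
import java.net.URI;

final class S3UriPrefix {
    // Hypothetical helper: returns the key prefix without the bucket
    // and without the leading slash that URI.getPath() keeps.
    static String prefixOf(String s3Uri) {
        URI uri = URI.create(s3Uri);   // throws IllegalArgumentException on bad syntax
        String path = uri.getPath();   // e.g. "/workflow/science/sweet-humoor/vars"
        if (path == null || path.isEmpty()) {
            return "";                 // URI names only the bucket
        }
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("s3://mybucket/workflow/science/sweet-humoor/vars"));
    }
}
```

Because the bucket is parsed as the URI authority, the result does not depend on the bucket name's length.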
Upvotes: 0
Reputation: 2861
String dir = "s3://mybucket/workflow/science/sweet-humoor/vars";
dir = dir.replace("//", "").substring( dir.indexOf("/") );
System.err.println(dir); // prints mybucket/workflow/science/sweet-humoor/vars
Upvotes: 0
Reputation: 3081
You can try this:
String dir2 = dir.replaceAll("s3://" + dir.split("/")[2] + "/", "");
Upvotes: 0
Reputation: 15008
It's cleanest to use the Java library functions for URIs and paths instead of handling the Strings directly. What you have is a URI (new URL(dir) would throw a MalformedURLException, because Java has no built-in handler for the s3 scheme), so
URI uri = URI.create(dir);
Path fullpath = Paths.get(uri.getHost(), uri.getPath());
Now you have a Path (i.e. the "mybucket/workflow/science/sweet-humoor/vars" part), and you can get the subpath by
// start index 1 to skip the first directory element; the end index is exclusive
Path subpath = fullpath.subpath(1, fullpath.getNameCount());
You can make a File out of this (subpath.toFile()), or just get the path string by
subpath.toString();
Upvotes: 1
Reputation: 59
I would split the string by "/", take the values from the third index onward, and join them with "/". Sample code in Python.
input_string = "s3://mybucket/workflow/science/sweet-humoor/vars"
list1 = (input_string.split("/"))
print(list1)
print("/".join(list1[3:]))
Output: workflow/science/sweet-humoor/vars
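Since the question is about Java, the same split-and-keep-the-rest idea can be sketched there with String.split and a limit, which avoids the join step. This is only an illustration: the class and method names (SplitPrefix, prefixOf) are hypothetical, and it assumes the input always starts with a scheme followed by "//".

```java
final class SplitPrefix {
    // Hypothetical helper: splits into at most 4 pieces,
    // ["s3:", "", "<bucket>", "<everything after the bucket>"].
    static String prefixOf(String s3Uri) {
        String[] parts = s3Uri.split("/", 4);
        // Fewer than 4 pieces means there is nothing after the bucket name.
        return parts.length == 4 ? parts[3] : "";
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("s3://mybucket/workflow/science/sweet-humoor/vars"));
    }
}
```

With a limit of 4, the last array element holds the remainder of the string unsplit, so slashes inside the key are preserved.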
Upvotes: -1
Reputation: 271775
Use a regular expression with replaceAll:
String result = directoryKey.replaceAll("s3://[^/]+/", "");
The regex here is:
s3://[^/]+/
It matches the part that you want to remove, which is s3:// followed by a bunch of non-slash characters, followed by a slash.
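The regex above is unanchored, so replaceAll would also strip a later s3://name/ sequence if one happened to appear inside the key. A possible variant, sketched here with hypothetical names (S3RegexPrefix, prefixOf), anchors the pattern and captures the key in a group. Rejecting non-matching input with an exception is an assumption of this sketch, not something the answer prescribes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class S3RegexPrefix {
    // Group 1: bucket name (no slashes). Group 2: everything after the bucket.
    private static final Pattern S3_URI = Pattern.compile("^s3://([^/]+)/(.*)$");

    // Hypothetical helper: returns the key prefix, or throws if the input
    // is not of the form s3://<bucket>/<key>.
    static String prefixOf(String s3Uri) {
        Matcher m = S3_URI.matcher(s3Uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an s3://bucket/key URI: " + s3Uri);
        }
        return m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("s3://mybucket/workflow/science/sweet-humoor/vars"));
    }
}
```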
Upvotes: 1