Reputation: 121
I have to check if a file name ends with a gzip extension. In particular, I'm looking for two extensions: ".tar.gz" and ".gz". I would like to capture the file name (and path) as a group using a single regular expression, excluding the gzip extension if any. I tested the following regular expressions on this example path:
String path = "/path/to/file.txt.tar.gz";
Expression 1:
String rgx = "(.+)(?=([\\.tar]?\\.gz)$)";
Expression 2:
String rgx = "^(.+)[\\.tar]?\\.gz$";
Extracting group 1 in this way:
Matcher m = Pattern.compile(rgx).matcher(path);
if(m.find()){
System.out.println(m.group(1));
}
Both regular expressions give me the same result, /path/to/file.txt.tar, and not /path/to/file.txt.
Any help will be appreciated.
Thanks in advance
Upvotes: 3
Views: 11734
Reputation: 421220
You need to make the part that matches the file name reluctant, i.e. change (.+) to (.+?):
String rgx = "^(.+?)(\\.tar)?\\.gz$";
//             ^^^^^ reluctant group; the trailing $ keeps the match anchored at the end of the name
Now you get:
Matcher m = Pattern.compile(rgx).matcher(path);
if(m.find()){
System.out.println(m.group(1)); // /path/to/file.txt
}
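To see why the quantifier matters, here is a minimal sketch that runs the greedy and the reluctant variants side by side, together with Expression 2 from the question. Note that `[\\.tar]?` in the question is a character class (one optional character out of `.`, `t`, `a`, `r`), not the optional literal ".tar". The class name `GreedyVsReluctant` and the helper `firstGroup` are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Illustrative helper: group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "/path/to/file.txt.tar.gz";

        // Expression 2 from the question: greedy (.+) plus a character class.
        System.out.println(firstGroup("^(.+)[\\.tar]?\\.gz$", path));  // prints /path/to/file.txt.tar

        // Greedy with a real optional group: (.+) still swallows ".tar",
        // because (\\.tar)? is allowed to match nothing.
        System.out.println(firstGroup("^(.+)(\\.tar)?\\.gz$", path));  // prints /path/to/file.txt.tar

        // Reluctant: (.+?) grows one character at a time and stops as soon as
        // the rest of the pattern can match, so ".tar" goes to the optional group.
        System.out.println(firstGroup("^(.+?)(\\.tar)?\\.gz$", path)); // prints /path/to/file.txt
    }
}
```

The reluctant version also behaves as expected for a plain ".gz" name: the optional group matches nothing and group 1 holds everything before ".gz".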
Upvotes: 3
Reputation: 48444
You can use the following idiom to match both your path + file name and the gzip extension in one go:
String[] inputs = {
"/path/to/foo.txt.tar.gz",
"/path/to/bar.txt.gz",
"/path/to/nope.txt"
};
//                           ┌ group 1: any character reluctantly quantified
//                           |    ┌ group 2
//                           |    | ┌ optional ".tar"
//                           |    | |       ┌ compulsory ".gz"
//                           |    | |       |     ┌ end of input
Pattern p = Pattern.compile("(.+?)((\\.tar)?\\.gz)$");
for (String s: inputs) {
Matcher m = p.matcher(s);
if (m.find()) {
System.out.printf("Found: %s --> %s %n", m.group(1), m.group(2));
}
}
Output
Found: /path/to/foo.txt --> .tar.gz
Found: /path/to/bar.txt --> .gz
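If you only need the name without the extension, the same pattern can be wrapped in a small helper. This is a sketch under a few assumptions: the names `GzipNames` and `stripGzipExtension` are hypothetical, and returning the input unchanged when there is no gzip extension is one possible reading of "excluding the gzip extension if any".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GzipNames {
    // Compiled once and reused. matches() must consume the whole input,
    // so the pattern needs neither ^ nor $.
    private static final Pattern GZIP = Pattern.compile("(.+?)(?:\\.tar)?\\.gz");

    // Hypothetical helper: strips a trailing ".tar.gz" or ".gz";
    // returns the path unchanged when it has neither extension.
    static String stripGzipExtension(String path) {
        Matcher m = GZIP.matcher(path);
        return m.matches() ? m.group(1) : path;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/path/to/foo.txt.tar.gz",
            "/path/to/bar.txt.gz",
            "/path/to/nope.txt"
        };
        for (String s : inputs) {
            System.out.println(s + " --> " + stripGzipExtension(s));
        }
    }
}
```

Using `matches()` instead of `find()` is a design choice here: it makes the "whole name" requirement explicit, so a name such as "foo.gz.bak" is left alone.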
Upvotes: 4
Reputation: 174826
Use a regex with capturing groups. The file-name group has to be reluctant, otherwise it swallows ".tar":
^(.+)/(.+?)(?:\\.tar)?\\.gz$
Then:
Get the path from group 1.
Get the file name from group 2.
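A minimal sketch of the two-group approach, assuming the file-name group is reluctant, i.e. `(.+?)`, so that ".tar" is left to the optional group. The class name `PathAndName` and the method `split` are invented for this example. Note that the pattern needs at least one character before the last "/", so a name without a directory part does not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathAndName {
    // Group 1: directory (greedy, so it runs up to the last "/").
    // Group 2: file name without the gzip extension (reluctant).
    private static final Pattern P =
            Pattern.compile("^(.+)/(.+?)(?:\\.tar)?\\.gz$");

    // Hypothetical helper: returns {directory, fileName}, or null when
    // the input does not end in ".gz" / ".tar.gz" or has no directory part.
    static String[] split(String path) {
        Matcher m = P.matcher(path);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] parts = split("/path/to/file.txt.tar.gz");
        System.out.println(parts[0]); // prints /path/to
        System.out.println(parts[1]); // prints file.txt
    }
}
```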
Upvotes: 1