Reputation: 21
I am working on a middleware tool in which we have a predefined option of using Java regular expressions with subStringRegEx(regex, string).
My requirement is to get the required substring between the underscores (_) from a given filename (e.g. ABC_XYZ_123_adbc1234-ed98_1234.dat).
I have tried the 3 patterns below, and all of them work when tested with online tools with the Java flavor selected. In my tool, however, they do not work as expected: I get “ABC_XYZ_123_adbc1234-ed98” instead of only “adbc1234-ed98”.
(?:[^_]+)_(?:[^_]+)_(?:[^_]+)_([^_]+)
.*?_.*?_.*?_([^_]+)
^[^_]*_[^_]*_[^_]*_([^_]*)_
Any suggestions on how to achieve this would be appreciated.
Thanks, Kumar
Upvotes: 2
Views: 130
Reputation: 163342
Just for completeness, all 3 patterns work but you have to get the value from group 1.
Example
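import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // The three patterns from the question, unchanged
        String[] patterns = {
            "(?:[^_]+)_(?:[^_]+)_(?:[^_]+)_([^_]+)",
            ".*?_.*?_.*?_([^_]+)",
            "^[^_]*_[^_]*_[^_]*_([^_]*)_"
        };
        String s = "ABC_XYZ_123_adbc1234-ed98_1234.dat";
        for (String p : patterns) {
            Pattern pattern = Pattern.compile(p);
            Matcher matcher = pattern.matcher(s);
            if (matcher.find()) {
                // group(1) is the captured field; group(0) would be the whole match
                System.out.println(matcher.group(1));
            }
        }
    }
}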
Output
adbc1234-ed98
adbc1234-ed98
adbc1234-ed98
See a Java demo.
Upvotes: 1
Reputation:
I'm not sure about the spec for subStringRegEx(regex, string), but if it returns the substring ($0) in string that matches regex, then it should be
String regex = "[^_]+(?=_[^_]*$)";
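To make the idea concrete, here is a minimal sketch that reads the whole match (group 0) for this lookahead pattern, which is what subStringRegEx is assumed to return. The class and method names are hypothetical. Note that the pattern picks the field just before the last underscore, which happens to be the fourth field in the sample filename.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // A run of non-underscore chars that is followed by the LAST underscore in the input.
    private static final Pattern BEFORE_LAST_UNDERSCORE = Pattern.compile("[^_]+(?=_[^_]*$)");

    // Returns the whole match (group 0), or null when nothing matches.
    static String wholeMatch(String input) {
        Matcher m = BEFORE_LAST_UNDERSCORE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The lookahead is not part of the match, so no capture group is needed.
        System.out.println(wholeMatch("ABC_XYZ_123_adbc1234-ed98_1234.dat"));
    }
}
```

If the filename can have more or fewer fields after the wanted one, this pattern would pick a different field, so it only fits names shaped like the sample.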
Upvotes: 0
Reputation: 153
You can simply use the String methods to achieve this (shown here in JavaScript):
const str = "ABC_XYZ_123_adbc1234-ed98_1234.dat"
// Split on underscores and take the fourth field (index 3)
const field = str.split("_")[3]
console.log(field)
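Since the question is about Java, a rough Java sketch of the same split-based idea may be useful. It assumes plain Java code can be run at all, which may not be the case inside the middleware tool; SplitDemo and fieldAt are hypothetical names.

```java
class SplitDemo {
    // Returns the underscore-separated field at the given zero-based index,
    // or null when the filename has no such field.
    static String fieldAt(String fileName, int index) {
        String[] parts = fileName.split("_");
        return (index >= 0 && index < parts.length) ? parts[index] : null;
    }

    public static void main(String[] args) {
        // Index 3 is the fourth field of the sample filename.
        System.out.println(fieldAt("ABC_XYZ_123_adbc1234-ed98_1234.dat", 3));
    }
}
```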
Upvotes: 0
Reputation: 133518
With your shown samples, please try the following regex. The value is captured in group 1, so replace with $1 when performing the substitution.
^(?:.*?_){3}([^_]*)_.*\.dat$
Or, in case the file extension could be anything (not only .dat), try the following:
^(?:.*?_){3}([^_]*)_.*
Explanation: a detailed breakdown of the first regex above.
^(?:.*?_){3} ##From the start of the value, a non-greedy match up to and including _, repeated 3 times in a non-capturing group.
([^_]*) ##First capturing group: everything up to the next occurrence of _.
_.*\.dat$ ##Matches from that _ through .dat at the end of the value.
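As a quick check of the substitution approach, here is a minimal sketch using Java's String.replaceAll, which is only an assumption about how the tool performs replacement. The class and method names are hypothetical. Because both patterns consume the entire input, replacing the match with $1 leaves just the captured field.

```java
class ReplaceDemo {
    // Pattern for .dat files only; the backslash is doubled inside a Java string literal.
    static final String DAT_ONLY = "^(?:.*?_){3}([^_]*)_.*\\.dat$";
    // Pattern for any extension.
    static final String ANY_EXTENSION = "^(?:.*?_){3}([^_]*)_.*";

    // Replaces the full match with capture group 1.
    // When the regex does not match, replaceAll returns the input unchanged.
    static String fourthField(String fileName, String regex) {
        return fileName.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        System.out.println(fourthField("ABC_XYZ_123_adbc1234-ed98_1234.dat", DAT_ONLY));
        System.out.println(fourthField("ABC_XYZ_123_adbc1234-ed98_1234.txt", ANY_EXTENSION));
    }
}
```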
Upvotes: 4
Reputation: 626816
You can use
^(?:[^_]+_){3}([^_]+).*
and replace with $1. See the regex demo.
Details:
^ - start of string
(?:[^_]+_){3} - three occurrences of any one or more chars other than _ and then a _ char
([^_]+) - Group 1 (referred to with $1 from the replacement pattern): one or more chars other than _
.* - the rest of the string.
Another idea:
^.*_([^_]+)_[0-9]+\.[^._]*$
See this regex demo, and you will still need to replace with $1.
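Both patterns in this answer can be tried with a short sketch like the one below. It uses Java's String.replaceAll as a stand-in for the tool's replace step, which is an assumption, and the class and method names are hypothetical.

```java
class GroupOneDemo {
    // Counts three "field_" prefixes from the start, then captures the fourth field.
    static final String FROM_START = "^(?:[^_]+_){3}([^_]+).*";
    // Anchors on the trailing "_digits.extension" part and captures the field before it.
    static final String FROM_END = "^.*_([^_]+)_[0-9]+\\.[^._]*$";

    // Replaces the whole match with Group 1; a non-matching input comes back unchanged.
    static String extract(String fileName, String regex) {
        return fileName.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        String name = "ABC_XYZ_123_adbc1234-ed98_1234.dat";
        System.out.println(extract(name, FROM_START));
        System.out.println(extract(name, FROM_END));
    }
}
```

The first pattern depends on the position of the field from the left, while the second depends on the shape of the tail, so which one fits better depends on which part of the filename is stable.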
Details:
^ - start of string
.* - any text (not including line break chars, as many as possible)
_ - a _ char
([^_]+) - one or more chars other than _
_ - a _ char
[0-9]+ - one or more digits
\. - a . char (NOTE: \ might need doubling)
[^._]* - any zero or more chars other than . and _
$ - end of string.
Upvotes: 1