Reputation: 54
I've tried the following regex (Java string format):
^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+)?.*)$
String to match is :
Some Money 2.6.2; iOS 5.1.1
It is supposed to return three groups:
group[0] :Some Money 2.6.2; iOS 5.1.1
group[1] :Some Money 2.6.2; iOS 5.1.1
group[2] :iOS 5.1.1
but it actually returns these:
group[0] :Some Money 2.6.2; iOS 5.1.1
group[1] :Some Money 2.6.2; iOS 5.1.1
group[2] :null
When I change the regex as below (removing the ? after the second group)
^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+).*)$
the string above matches as expected, but it can't match a string like
whatever iS 5.1.1 whatever
What I want to achieve is a regex that returns three groups no matter what the string looks like. The first and second groups should always be the entire string. The third group is the substring that matches '(iOS|Android) [\d.]*' if the string contains that part, and is null or empty if it doesn't.
Upvotes: 1
Views: 450
Reputation: 54
I finally solved the problem with the regex below.
(.*((?:iOS|Android)\\s+[0-9\\.]+).*|.*)
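A minimal sketch of how the alternation above behaves with Matcher.matches(). The class name AlternationDemo and the helper groups are hypothetical names used only for illustration; the pattern is the one from this answer.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {
    // Pattern from the answer: the first alternative captures the OS part in group 2,
    // the second alternative (a plain .*) is the fallback when no OS part exists.
    static final Pattern PATTERN =
            Pattern.compile("(.*((?:iOS|Android)\\s+[0-9\\.]+).*|.*)");

    // Hypothetical helper: returns {group(0), group(1), group(2)},
    // or null if the input does not match at all.
    static String[] groups(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Some Money 2.6.2; iOS 5.1.1",
            "whatever iS 5.1.1 whatever"
        };
        for (String input : inputs) {
            System.out.println(Arrays.toString(groups(input)));
        }
    }
}
```

For the first input, group(2) is "iOS 5.1.1"; for the second, the fallback alternative matches and group(2) is null. Note that if an input contained several OS parts, the greedy leading .* would make group(2) the last one.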
Upvotes: 0
Reputation: 1421
Maybe you can use the ; delimiter as an indication that your iOS 5.1.1 part starts?
Then a pattern may look like .+;\\s+(.+)
- .+; consumes everything up to the semi-colon
- \\s+ consumes the spaces between the semi-colon and the start of the version string
- (.+) consumes everything up to the end
If you really only want to match iOS or Android, then you might want to add a non-capturing group within the (.+) part.
A regexp would then look like this: ".+;\\s+((?:iOS|Android).+)"
And here is an executable example of what a solution may look like. It shows the behaviour of both pattern variants I explained above.
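import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterPatternDemo {
    public static void main(String[] args) {
        String input1 = "Some Money 2.6.2; iS 5.1.1 ";
        String input2 = "Some Money 2.6.2; iOS 5.1.1 ";
        String input3 = "Some Money 2.6.2; Android 5.1.1 ";
        // Variant 1: capture everything after the semi-colon.
        String pattern1 = ".+;\\s+(.+)";
        // Variant 2: capture only if the part after the semi-colon starts with iOS or Android.
        String pattern2 = ".+;\\s+((?:iOS|Android).+)";
        System.out.println(pattern1);
        matchPattern(input1, pattern1);
        matchPattern(input2, pattern1);
        matchPattern(input3, pattern1);
        System.out.println();
        System.out.println(pattern2);
        matchPattern(input1, pattern2);
        matchPattern(input2, pattern2);
        matchPattern(input3, pattern2);
    }

    // Prints the groups of a full match; prints nothing if the input does not match.
    private static void matchPattern(String input, String pattern) {
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(input);
        if (m.matches()) {
            System.out.println(m.group(0));
            System.out.println(m.group(1));
            // Both patterns above have a single capturing group, so this branch
            // only runs for patterns that define a second group.
            if (m.groupCount() > 1) {
                System.out.println(m.group(2));
            }
        }
    }
}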
Update: Since the target of the question got clearer due to some edits by the author, I feel the need to update my answer. If it is about always getting three groups, the following might be better than working out all possible notation variants:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreeGroupsDemo {
    public static void main(String[] args) {
        String input1 = "Some Money 2.6.2; iS 5.1.1";
        String input2 = "Some Money 2.6.2; iOS 5.1.1";
        String input3 = "Some Money 2.6.2; Android 5.1.1";
        String input4 = "Some Money 2.6.2 iOS 5.1.1";
        String input5 = "Some Money 2.6.2 iOS";
        String input6 = "Some Money 2.6.2";
        // Group 1 is the whole input; group 2 is the OS part (null if absent).
        String pattern1 = "(.*?(?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)?)";
        System.out.println(pattern1);
        matchPattern(input1, pattern1);
        matchPattern(input2, pattern1);
        matchPattern(input3, pattern1);
        matchPattern(input4, pattern1);
        matchPattern(input5, pattern1);
        matchPattern(input6, pattern1);
    }

    // Prints group 0, group 1 and group 2, followed by a blank line.
    private static void matchPattern(String input, String pattern) {
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(input);
        if (m.matches()) {
            System.out.println(m.group(0));
            System.out.println(m.group(1));
            System.out.println(m.group(2));
            System.out.println();
        }
    }
}
Here the pattern is (.*?(?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)?)
- .*? consumes everything before the version string. If no version string is available at all, it matches the whole input. The reluctant quantifier is needed here. It takes the shortest match that still matches and so avoids that the whole input is consumed.
- (?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)? consumes the whole version string and everything that follows.
- ((?:iOS|Android)(?:\\s+[0-9\\.]+)?) is the group(2) output. It just matches the OS string, iOS or Android, with an optional version suffix consisting of digits and dots.
Upvotes: 2
Reputation: 79
Please refer to this topic about "How a RegEx engine works".
- Those based on back-tracking. These often compile the pattern into byte-code, resembling machine instructions. The engine then executes the code, jumping from instruction to instruction. When an instruction fails, it then back-tracks to find another way to match the input.
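Your regular expression has many ways to match the input. Sadly, the engine returns a different one than you expected: the first .* is greedy and consumes the whole input, and the optional 2nd group is then allowed to match nothing, so it stays null.
By removing the "?" quantifier from the 2nd group, it becomes required. Any match returned must then include all required groups.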
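The difference between the optional and the required group can be sketched as follows. The two regexes are the ones from the question; the class name OptionalGroupDemo and the helper describe are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroupDemo {
    // Regex from the question: the 2nd group is optional ("?").
    static final String OPTIONAL = "^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+)?.*)$";
    // Same regex with the "?" removed, so the 2nd group is required.
    static final String REQUIRED = "^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+).*)$";

    // Hypothetical helper: reports group(2) of a full match, or "no match".
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? "group(2)=" + m.group(2) : "no match";
    }

    public static void main(String[] args) {
        String withOs = "Some Money 2.6.2; iOS 5.1.1";
        String withoutOs = "whatever iS 5.1.1 whatever";
        // Greedy .* takes the whole input, so the optional group stays unmatched.
        System.out.println(describe(OPTIONAL, withOs));
        // The required group forces .* to backtrack until "iOS 5.1.1" fits.
        System.out.println(describe(REQUIRED, withOs));
        // Without an OS part, only the optional variant can match.
        System.out.println(describe(OPTIONAL, withoutOs));
        System.out.println(describe(REQUIRED, withoutOs));
    }
}
```

With the optional group, group(2) is null for both inputs. With the required group, the first input yields "iOS 5.1.1" and the second does not match at all, which is the trade-off described in the question.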
Upvotes: 0