Reputation: 65
I'm trying to match a String like this: 62.00|LQ+2*2,FP,MD*3 "Description"
The number has an optional two-digit decimal part. Each user is identified by two chars and can be followed by
(\+[\d]+)? or (\*[\d]+)? or neither, or both in either order,
like:
LQ*2+4 | LQ+4*2 | LQ*2 | LQ+8 | LQ
The description is also optional.
What I have tried is this:
Pattern.compile("^(?<number>[\\d]+(\\.[\\d]{2})?)\\|(?<users>([A-Z]{2}){1}(((\\+[\\d]+)?(\\*[\\d]+)?)|((\\+[\\d]+)?(\\*[\\d]+)?))((,[A-Z]{2})(((\\+[\\d]+)?(\\*[\\d]+)?)|((\\+[\\d]+)?(\\*[\\d]+)?)))*)(\\s\\\"(?<message>.+)\\\")?$");
I need to get all the users so I can split them by ',' and then parse each one with a further regex, but I cannot grab anything out of it. The desired output from
62.00|LQ+2*2,FP,MD*3 "Description"
Should be:
62.00
LQ+2*2,FP,MD*3
Description
Accepted inputs should be of these kinds:
62.00|LQ+2*2,FP,MD*3
30|LQ "Burgers"
35.15|LQ*2,FP+2*4,MD*3+4 "Potatoes"
35.15|LQ,FP,MD
Upvotes: 2
Views: 88
Reputation: 27723
It looks like we have several optional groups here, which should not be a problem. What I'm not quite sure about is the full range of inputs and the desired outputs.
If we just want to match everything, we might start with something similar to:
[0-9]+(\.[0-9]{2})?\|[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?(\s+"Description")?
Here, we simply add a ? after every sub-expression that we wish to make optional, then use character classes and quantifiers, working from left to right to cover the inputs. Note that this pattern expects exactly three comma-separated users and the literal text "Description".
If we want to capture, we simply wrap any part that we want captured in a capturing group ().
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatchDemo {
    public static void main(String[] args) {
        final String regex = "[0-9]+(\\.[0-9]{2})?\\|[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?(\\s+\"Description\")?";
        final String string = "62.00|LQ+2*2,FP,MD*3 \"Description\"\n"
            + "62|LQ+2*2,FP,MD*3 \"Description\"\n"
            + "62|LQ+2*2,FP,MD*3\n"
            + "62|LQ*2,FP,MD*3\n"
            + "62|LQ+8,FP,MD*3\n"
            + "62|LQ,FP,MD";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        // Print every match, then each capture group (null if the group did not participate).
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
If we wish to output the three listed groups:
([0-9]+(\.[0-9]{2})?)\|([A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?)(\s+"Description")?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexReplaceDemo {
    public static void main(String[] args) {
        final String regex = "([0-9]+(\\.[0-9]{2})?)\\|([A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?)(\\s+\"Description\")?";
        final String string = "62.00|LQ+2*2,FP,MD*3 \"Description\"\n"
            + "62|LQ+2*2,FP,MD*3 \"Description\"\n"
            + "62|LQ+2*2,FP,MD*3\n"
            + "62|LQ*2,FP,MD*3\n"
            + "62|LQ+8,FP,MD*3\n"
            + "62|LQ,FP,MD";
        // Java replacement strings reference groups with $n, not \n.
        // Group 7 still carries the leading whitespace and the quotes.
        final String subst = "$1\n$3\n$7";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}
Based on the updated desired output, this might work:
([0-9]+(\.[0-9]{2})?)\|((?:[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,?)(?:[A-Z]{2}[+*]?([0-9]+)?[*+]?([0-9]+)?,?[A-Z]{2}?[*+]?([0-9]+)?[+*]?([0-9]+)?)?)(\s+"(.+?)")?
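To see which groups hold the three desired values, here is a minimal sketch that runs the pattern above against the accepted inputs from the question. The class name UpdatedPatternDemo and the helper extract are made up for illustration. By my count of the parentheses, the number is group 1, the user list is group 3, and the unquoted description is group 11. The pattern is unanchored and appears to be written for one or three users, so treat this as a check of the sample inputs rather than a general parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UpdatedPatternDemo {
    // The updated pattern from above, escaped for a Java string literal.
    static final Pattern P = Pattern.compile(
        "([0-9]+(\\.[0-9]{2})?)\\|((?:[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,?)"
        + "(?:[A-Z]{2}[+*]?([0-9]+)?[*+]?([0-9]+)?,?[A-Z]{2}?[*+]?([0-9]+)?[+*]?([0-9]+)?)?)"
        + "(\\s+\"(.+?)\")?");

    // Hypothetical helper: returns {number, users, description} or null if nothing matches.
    // The description entry is null when the input has no quoted text.
    static String[] extract(String line) {
        Matcher m = P.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(3), m.group(11) };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "62.00|LQ+2*2,FP,MD*3 \"Description\"",
            "30|LQ \"Burgers\"",
            "35.15|LQ*2,FP+2*4,MD*3+4 \"Potatoes\"",
            "35.15|LQ,FP,MD"
        };
        for (String input : inputs) {
            String[] parts = extract(input);
            if (parts != null) {
                System.out.println(parts[0]);
                System.out.println(parts[1]);
                System.out.println(parts[2]);
            }
        }
    }
}
```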
Upvotes: 1
Reputation: 18357
The inputs you described should be matched precisely by this regex:
^(\d+(?:\.\d{1,2})?)\|([a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?(?:,[a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?)*)(?: +(.+))?$
Here group1 contains the number, which can have an optional decimal part of up to two digits; group2 contains the comma-separated users as you described in your post; and group3 contains the optional description if present (including its surrounding quotes, since they are matched by .+).
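As a rough illustration of how this could be used from Java, here is a minimal sketch. The class name OrderLineParser and the parse helper are made up for the example; only the regex itself comes from this answer. Because group3 keeps the quotes, the sketch strips them afterwards, on the assumption that the description is always wrapped in double quotes as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OrderLineParser {
    // The regex from this answer, unchanged, escaped for a Java string literal.
    static final Pattern LINE = Pattern.compile(
        "^(\\d+(?:\\.\\d{1,2})?)\\|"
        + "([a-zA-Z]{2}(?:(?:\\+\\d+(?:\\*\\d+)?)|(?:\\*\\d+(?:\\+\\d+)?))?"
        + "(?:,[a-zA-Z]{2}(?:(?:\\+\\d+(?:\\*\\d+)?)|(?:\\*\\d+(?:\\+\\d+)?))?)*)"
        + "(?: +(.+))?$");

    // Hypothetical helper: returns {number, users, description}, or null if the
    // line does not match. The description entry is null when it is absent.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String description = m.group(3);
        // Assumption: descriptions are wrapped in double quotes; strip them if so.
        if (description != null && description.length() >= 2
                && description.startsWith("\"") && description.endsWith("\"")) {
            description = description.substring(1, description.length() - 1);
        }
        return new String[] { m.group(1), m.group(2), description };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "62.00|LQ+2*2,FP,MD*3 \"Description\"",
            "30|LQ \"Burgers\"",
            "35.15|LQ*2,FP+2*4,MD*3+4 \"Potatoes\"",
            "35.15|LQ,FP,MD"
        };
        for (String input : inputs) {
            String[] parts = parse(input);
            if (parts == null) {
                System.out.println("No match: " + input);
                continue;
            }
            System.out.println(parts[0]);
            // Split the user list on ',' for further per-user processing.
            for (String user : parts[1].split(",")) {
                System.out.println("  user: " + user);
            }
            System.out.println(parts[2]);
        }
    }
}
```

For the first input, parse yields 62.00, LQ+2*2,FP,MD*3 and Description, which is the output asked for in the question.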
Explanation of regex:
^ - Start of string
(\d+(?:\.\d{1,2})?) - Matches the number, which can have an optional one or two digits after the decimal point, and captures it in group1
\| - Matches the literal | present in your input after the number
([a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?(?:,[a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?)*) - Matches two letters, optionally followed by either + followed by a number and optionally * followed by a number, OR * followed by a number and optionally + followed by a number. The same structure repeats for every further comma-separated user, and the whole list is captured in group2
(?: +(.+))? - Matches the optional description and captures it in group3
$ - Marks end of input
Upvotes: 2