Reputation: 299
This is my regex:
([+-]*)(\\d+)\\s*([a-zA-Z]+)
I would like to match the given input, but it can be "chained". The input should be valid if and only if the whole pattern repeats with nothing between the occurrences except whitespace (that is, a single match, or multiple matches next to each other with optional whitespace between them).
valid examples:
1day
+1day
-1 day
+1day-1month
+1day +1month
invalid examples:
###+1day+1month
+1day###+1month
+1day+1month###
###+1day+1month###
In my case I can use the matcher.find() method. That would do the trick, but it would also accept input like this: +1day###+1month
which is not valid for me.
Any ideas? This can be solved with multiple if conditions and multiple checks of the start and end indexes, but I'm looking for an elegant solution.
EDIT
The regex suggested in the comments below,
^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$
partially does the trick, but when I use it in my code it returns a different result from the one I'm looking for. The problem is that I cannot use (*my regex*)+ because it matches the whole input as a single match.
One solution would be to validate the whole input with
^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$
and then use
([+-]*)(\\d+)\\s*([a-zA-Z]+)
with matcher.find() and matcher.group(i) to extract each match and its groups. But I was looking for a more elegant solution.
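For reference, the two-step approach described in this edit could look roughly like the sketch below. It reuses the two patterns from the question unchanged; the class name ChainedDurationParser, the parse method, and the choice to return an empty list for invalid input are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the question.
final class ChainedDurationParser {
    // One item: sign(s), digits, optional whitespace, letters (the question's regex).
    private static final String ITEM = "([+-]*)(\\d+)\\s*([a-zA-Z]+)";
    // Whole input: one or more items with nothing but whitespace around them.
    private static final Pattern WHOLE = Pattern.compile("^\\s*(" + ITEM + "\\s*)+$");
    private static final Pattern SINGLE = Pattern.compile(ITEM);

    /** Returns {sign, number, unit} per item, or an empty list if the input is invalid. */
    static List<String[]> parse(String input) {
        List<String[]> items = new ArrayList<>();
        // Step 1: reject input with anything other than whitespace between items.
        if (!WHOLE.matcher(input).matches()) {
            return items;
        }
        // Step 2: walk over the individual items and collect their groups.
        Matcher m = SINGLE.matcher(input);
        while (m.find()) {
            items.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return items;
    }

    public static void main(String[] args) {
        for (String[] item : parse("+1day -1 month")) {
            System.out.println(String.join(" | ", item));
        }
    }
}
```

The validation step is what keeps find() from silently skipping over the ### in +1day###+1month.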
Upvotes: 10
Views: 277
Reputation: 89557
You can proceed like this:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String p = "\\G\\s*(?:([-+]?)(\\d+)\\s*([a-z]+)|\\z)";
Pattern pattern = Pattern.compile(p, Pattern.CASE_INSENSITIVE);
String s = "+1day 1month";
ArrayList<HashMap<String, String>> results = new ArrayList<>();
Matcher m = pattern.matcher(s);
boolean validFormat = false;
while (m.find()) {
    if (m.group(1) == null) {
        // If capture group 1 (or 2 or 3) is null, the second branch of the
        // pattern (the \z branch) has succeeded, so the end of the string
        // has been reached.
        validFormat = true;
    } else {
        // Otherwise this is not the end of the string, and the match result
        // is temporarily stored in the ArrayList 'results'.
        HashMap<String, String> result = new HashMap<>();
        result.put("sign", m.group(1));
        result.put("multiplier", m.group(2));
        result.put("time_unit", m.group(3));
        results.add(result);
    }
}
if (validFormat) {
    for (HashMap<String, String> item : results) {
        System.out.println("sign: " + item.get("sign")
                + "\nmultiplier: " + item.get("multiplier")
                + "\ntime_unit: " + item.get("time_unit") + "\n");
    }
} else {
    results.clear();
    System.out.println("Invalid Format");
}
The \G anchor matches at the start of the string or at the position where the previous match ended. In this pattern, it ensures that all matches are contiguous. If the end of the string is reached, that proves the string is valid from start to end.
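To see the effect of \G in isolation, here is a small sketch that counts how many times find() succeeds with and without the anchor. The class and method names (ContiguousMatchDemo, countMatches) are made up for this illustration, and the item pattern is a simplified version without capture groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class ContiguousMatchDemo {
    /** Counts how many times find() succeeds on the input. */
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String item = "[-+]?\\d+\\s*[a-z]+";
        String input = "+1day###+1month";
        // Without \G, find() skips over "###" and reports both items.
        System.out.println(countMatches(item, input));
        // With \G, each match must start where the previous one ended,
        // so the search stops in front of "###".
        System.out.println(countMatches("\\G\\s*" + item, input));
    }
}
```

Because the \G version gives up at the first gap, the end of the string is never reached for this input, which is why the \z branch in the pattern above works as a validity check.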
Upvotes: 0
Reputation: 43743
This should work for you:
^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$
First, by adding the beginning and ending anchors (^ and $), the pattern will not allow invalid characters to occur anywhere before or after the match.
Next, I included optional whitespace before and after the repeated pattern (\s*).
Finally, the entire pattern is enclosed in a repeater so that it can occur multiple times in a row ((...)+).
On a side note, I'd also recommend changing [+-]* to [+-]? so that the sign can occur at most once.
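One thing to keep in mind when using this pattern from Java: a capturing group inside a repeated group only keeps the text from its last iteration, so a single matches() call validates the input but does not give you every item. A minimal sketch of that behaviour, using the [+-]? variant (RepeatedGroupDemo and lastUnit are names invented for this example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class RepeatedGroupDemo {
    private static final Pattern CHAIN =
            Pattern.compile("^\\s*(([+-]?)(\\d+)\\s*([a-zA-Z]+)\\s*)+$");

    /** Returns what group 4 holds after a full match, or null if the input is invalid. */
    static String lastUnit(String input) {
        Matcher m = CHAIN.matcher(input);
        return m.matches() ? m.group(4) : null;
    }

    public static void main(String[] args) {
        // Only the unit of the last item survives in group 4.
        System.out.println(lastUnit("+1day -2month"));
        // Invalid input: no match at all.
        System.out.println(lastUnit("+1day###+1month"));
    }
}
```

So the anchored pattern is a good validity check, but extracting each sign, number, and unit still needs a second pass over the input.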
Upvotes: 7
Reputation: 941
You can use String.matches or Matcher.matches in Java to match the entire region.
Java Example:
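import java.util.regex.Pattern;

// JUnit 4 imports are an assumption; the original does not say which version is used.
import org.junit.Test;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class RegTest {
    public static final Pattern PATTERN = Pattern.compile(
            "(\\s*([+-]?)(\\d+)\\s*([a-zA-Z]+)\\s*)+");

    @Test
    public void testDays() throws Exception {
        assertTrue(valid("1 day"));
        assertTrue(valid("-1 day"));
        assertTrue(valid("+1day-1month"));
        assertTrue(valid("+1day -1month"));
        assertTrue(valid(" +1day +1month "));
        assertFalse(valid("+1day###+1month"));
        assertFalse(valid(""));
        assertFalse(valid("++1day-1month"));
    }

    // matches() succeeds only if the pattern covers the entire input.
    private static boolean valid(String s) {
        return PATTERN.matcher(s).matches();
    }
}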
Upvotes: 0
Reputation: 17831
You could use ^ and $ for that, to match the start and end of the string:
^\s*(?:([+-]?)(\d+)\s*([a-z]+)\s*)+$
https://regex101.com/r/lM7dZ9/2
See the unit tests for your examples. Basically, you just need to allow the pattern to repeat and require that nothing besides whitespace occurs between the matches.
Combine that with start/end anchoring and you're done.
Upvotes: 0