user1610458

Reputation: 299

Multiple matches with delimiter

This is my regex:

([+-]*)(\\d+)\\s*([a-zA-Z]+)

I would like to match the given input, but the pattern can be "chained". The input should be valid if and only if the whole pattern repeats with nothing between the occurrences except whitespace: either a single match, or multiple matches next to each other, optionally separated by whitespace.

Valid examples:

1day
+1day
-1 day
+1day-1month
+1day +1month
   +1day  +1month    

Invalid examples:

###+1day+1month
+1day###+1month
+1day+1month###
###+1day+1month###

In my case I can use the matcher.find() method. This would do the trick, but it would also accept input like +1day###+1month, which is not valid for me.

Any ideas? This can be solved with multiple if conditions and checks on the start and end indexes, but I'm looking for an elegant solution.

EDIT

The regex suggested in the comments below, ^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$, partially does the trick, but if I use it in the code below it returns a different result from the one I'm looking for. The problem is that I cannot use (*my regex*)+ because it matches the whole input as a single match instead of giving me each occurrence.

One solution could be to validate the whole input with ^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$ and then use ([+-]*)(\\d+)\\s*([a-zA-Z]+) with matcher.find() and matcher.group(i) to extract each match and its groups. But I was looking for a more elegant solution.

Upvotes: 10

Views: 277

Answers (4)

Casimir et Hippolyte

Reputation: 89557

You can proceed like this:

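// Requires java.util.ArrayList, java.util.HashMap,
// java.util.regex.Matcher and java.util.regex.Pattern.
String p = "\\G\\s*(?:([-+]?)(\\d+)\\s*([a-z]+)|\\z)";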

Pattern RegexCompile = Pattern.compile(p, Pattern.CASE_INSENSITIVE);

String s = "+1day 1month";

ArrayList<HashMap<String, String>> results = new ArrayList<HashMap<String, String>>(); 

Matcher m = RegexCompile.matcher(s);
boolean validFormat = false;        

while( m.find() ) {
    if (m.group(1) == null) {
        // if the capture group 1 (or 2 or 3) is null, it means that the second
        // branch of the pattern has succeeded (the \z branch) and that the end
        // of the string has been reached. 
        validFormat = true;
    } else {
        // otherwise, this is not the end of the string and the match result is
        // temporarily stored in the ArrayList 'results'
        HashMap<String, String> result = new HashMap<String, String>();
        result.put("sign", m.group(1));
        result.put("multiplier", m.group(2));
        result.put("time_unit", m.group(3));
        results.add(result);
    }
}

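// An empty or whitespace-only input reaches \z without any term being
// matched, so also require at least one result.
if (validFormat && !results.isEmpty()) {
    for (HashMap<String, String> item : results) {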
        System.out.println("sign: " + item.get("sign")
                         + "\nmultiplier: " + item.get("multiplier")
                         + "\ntime_unit: " + item.get("time_unit") + "\n");
    }
} else {
    results.clear();
    System.out.println("Invalid Format");
}

The \G anchor matches at the start of the string or at the position where the previous match ended. In this pattern, it ensures that all matches are contiguous. If the end of the string is reached, this proves that the string is valid from start to end.
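To see the effect of \G in isolation, here is a minimal sketch that uses the same term pattern as above but without the \z branch. The class name ContiguousMatchDemo and the helper contiguousTokens are made up for this illustration. It collects terms with find() and stops at the first gap, because \G cannot match anywhere except right after the previous match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContiguousMatchDemo {
    // Each match must begin at the start of the input or exactly where the
    // previous match ended; leading whitespace is consumed by \s*.
    private static final Pattern TERM = Pattern.compile(
            "\\G\\s*([-+]?)(\\d+)\\s*([a-z]+)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the terms found before the first gap.
    public static List<String> contiguousTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TERM.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1) + m.group(2) + m.group(3));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(contiguousTokens("+1day +1month"));   // [+1day, +1month]
        System.out.println(contiguousTokens("+1day###+1month")); // [+1day]
    }
}
```

Without \G, the second call would also return +1month, which is exactly the behaviour the question wants to avoid. The \z branch in the full pattern is what tells you whether the loop stopped at the end of the input or at a gap.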

Upvotes: 0

Steven Doggart

Reputation: 43743

This should work for you:

^\s*(([+-]*)(\d+)\s*([a-zA-Z]+)\s*)+$

First, by adding the beginning and ending anchors (^ and $), the pattern will not allow invalid characters to occur anywhere before or after the match.

Next, I included optional whitespace before and after the repeated pattern (\s*).

Finally, the entire pattern is enclosed in a repeater so that it can occur multiple times in a row ((...)+).

On a side note, I'd also recommend changing [+-]* to [+-]? so that the sign can occur at most once.
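If you also need the individual terms, one option is the two-step approach mentioned in the question: validate with the anchored pattern first, then extract with the single-term pattern. The following is only a sketch under that assumption. The class name ChainParser and the method parse are invented for the example, and an empty list is used here to signal invalid input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChainParser {
    // Anchored pattern from this answer, with [+-]? instead of [+-]*.
    private static final Pattern WHOLE = Pattern.compile(
            "^\\s*(([+-]?)(\\d+)\\s*([a-zA-Z]+)\\s*)+$");
    // Single-term pattern, used only to pull out the individual terms.
    private static final Pattern TERM = Pattern.compile(
            "([+-]?)(\\d+)\\s*([a-zA-Z]+)");

    // Returns one {sign, number, unit} array per term,
    // or an empty list if the input as a whole is not valid.
    public static List<String[]> parse(String input) {
        List<String[]> terms = new ArrayList<>();
        if (!WHOLE.matcher(input).matches()) {
            return terms;
        }
        Matcher m = TERM.matcher(input);
        while (m.find()) {
            terms.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return terms;
    }

    public static void main(String[] args) {
        for (String[] t : parse("  +1day -2 month ")) {
            System.out.println("sign=" + t[0] + " number=" + t[1] + " unit=" + t[2]);
        }
        System.out.println(parse("+1day###+1month").isEmpty());
    }
}
```

Because the whole input has already been validated, find() cannot skip over stray characters in the second step; it only ever steps over whitespace.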


Upvotes: 7

Loki

Reputation: 941

You can use String.matches or Matcher.matches in Java to match the entire region.

Java Example:

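// Imports assume JUnit 4; adjust them if you use a different test framework.
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.util.regex.Pattern;
import org.junit.Test;

public class RegTest {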

    public static final Pattern PATTERN = Pattern.compile(
            "(\\s*([+-]?)(\\d+)\\s*([a-zA-Z]+)\\s*)+");

    @Test
    public void testDays() throws Exception {
        assertTrue(valid("1 day"));
        assertTrue(valid("-1 day"));
        assertTrue(valid("+1day-1month"));
        assertTrue(valid("+1day -1month"));
        assertTrue(valid("   +1day  +1month   "));

        assertFalse(valid("+1day###+1month"));
        assertFalse(valid(""));
        assertFalse(valid("++1day-1month"));
    }

    private static boolean valid(String s) {
        return PATTERN.matcher(s).matches();
    }
}

Upvotes: 0

F.P

Reputation: 17831

You could use ^ and $ for that, to match the start and end of the string:

^\s*(?:([+-]?)(\d+)\s*([a-z]+)\s*)+$

https://regex101.com/r/lM7dZ9/2

See the unit tests there for your examples. Basically, you just need to allow the pattern to repeat and ensure that nothing besides whitespace occurs between the matches.

Combine that with matching the start and end of the string and you're done.
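Two things to keep in mind when using this in Java; the sketch below illustrates both. First, [a-z] only matches lowercase letters, so the example assumes Pattern.CASE_INSENSITIVE. Second, a capturing group inside a repeated group only keeps the captures from its last iteration, so this pattern validates the chain but does not give you every term. The class name LastIterationDemo and the method lastTerm are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastIterationDemo {
    // CASE_INSENSITIVE is an assumption so that [a-z] also accepts "Day" or "MONTH".
    private static final Pattern CHAIN = Pattern.compile(
            "^\\s*(?:([+-]?)(\\d+)\\s*([a-z]+)\\s*)+$", Pattern.CASE_INSENSITIVE);

    // Returns the captures that remain after a full match (those of the last
    // term in the chain), or null if the input is not valid.
    public static String[] lastTerm(String input) {
        Matcher m = CHAIN.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", lastTerm("+1day -2Month"))); // -|2|Month
        System.out.println(lastTerm("+1day###+1month") == null);         // true
    }
}
```

So if you need the groups of every term, you still have to iterate over the input with find() after (or instead of) this validation.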

Upvotes: 0
