Dave221

Reputation: 47

Parsing Interval Notation to Guava Range

I need to parse a string containing standard interval notation (e.g. (8,100), [6,10), and so forth) into a Guava Range object. How would I go about doing that in Java? Is there a utility package that would parse the string into the components I would need to construct a Guava Range object?

Upvotes: 3

Views: 1839

Answers (2)

Riaan Schutte

Reputation: 535

Had a similar problem and came up with this solution:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.google.common.collect.Range;
import com.google.common.primitives.Ints;
// @NonNull comes from whichever nullness annotation library the project uses.

private static final Pattern INTERVAL_PATTERN = Pattern.compile("([\\[\\(])(-?∞?\\d*)(?:\\,|\\.\\.)(-?∞?\\d*)([\\]\\)])");

/**
 * Parses integer ranges of format (2,5], (2..5], (2,), [2..), [2..∞), [2,∞)
 *
 * @param notation The range notation to parse
 * @return the parsed range; an endpoint that is empty or ∞ is treated as unbounded
 * @throws IllegalArgumentException if the interval is not in the defined notation format
 *         or if the endpoints do not form a valid Range.
 */
public static Range<Integer> parseIntRange(@NonNull String notation) {
    Matcher matcher = INTERVAL_PATTERN.matcher(notation);
    if (matcher.matches()) {

        // Ints.tryParse returns null for anything that is not a plain integer (empty, ∞, -∞)
        Integer lowerBoundEndpoint = Ints.tryParse(matcher.group(2));
        Integer upperBoundEndpoint = Ints.tryParse(matcher.group(3));
        if (lowerBoundEndpoint == null && upperBoundEndpoint == null) {
            return Range.all();
        }
        boolean lowerBoundInclusive = matcher.group(1).equals("[");
        boolean upperBoundInclusive = matcher.group(4).equals("]");

        //lower infinity case
        if (lowerBoundEndpoint == null) {
            if (upperBoundInclusive) {
                return Range.atMost(upperBoundEndpoint);
            } else {
                return Range.lessThan(upperBoundEndpoint);
            }
        } //upper infinity case
        else if (upperBoundEndpoint == null) {
            if (lowerBoundInclusive) {
                return Range.atLeast(lowerBoundEndpoint);
            } else {
                return Range.greaterThan(lowerBoundEndpoint);
            }
        }

        //non infinity cases
        if (lowerBoundInclusive) {
            if (upperBoundInclusive) {
                return Range.closed(lowerBoundEndpoint, upperBoundEndpoint);
            } else {
                return Range.closedOpen(lowerBoundEndpoint, upperBoundEndpoint);
            }

        } else {
            if (upperBoundInclusive) {
                return Range.openClosed(lowerBoundEndpoint, upperBoundEndpoint);
            } else {
                return Range.open(lowerBoundEndpoint, upperBoundEndpoint);
            }
        }
    } else {
        throw new IllegalArgumentException(notation + " is not a valid range notation");
    }
}
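To see what the four capture groups of INTERVAL_PATTERN hold before they are turned into a Range, here is a small sketch that uses only java.util.regex. The class name IntervalGroupsDemo and the helper groups(...) are made up for illustration; the infinity sign is written as the escape \u221E so the source stays ASCII.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntervalGroupsDemo {
    // Same expression as INTERVAL_PATTERN above; \u221E is the infinity sign.
    static final Pattern INTERVAL_PATTERN = Pattern.compile(
        "([\\[\\(])(-?\u221E?\\d*)(?:\\,|\\.\\.)(-?\u221E?\\d*)([\\]\\)])");

    // Returns {opening bracket, lower text, upper text, closing bracket},
    // or null if the notation does not match the pattern.
    static String[] groups(String notation) {
        Matcher m = INTERVAL_PATTERN.matcher(notation);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] samples = { "(2,5]", "[2..\u221E)", "(-\u221E,]", "[2.3]" };
        for (String s : samples) {
            String[] g = groups(s);
            System.out.println(s + " -> " + (g == null ? "no match" : String.join(" | ", g)));
        }
    }
}

Groups 2 and 3 are plain strings at this point; an empty string or one containing the infinity sign is what later makes Ints.tryParse return null and selects the unbounded cases.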

Unit tests:

@Test
public void testParseIntRange_infinites_parsesOK() {
    assertThat(NumberUtils.parseIntRange("(,2)"), is(Range.lessThan(2)));
    assertThat(NumberUtils.parseIntRange("(2,)"), is(Range.greaterThan(2)));
    assertThat(NumberUtils.parseIntRange("(,2]"), is(Range.atMost(2)));
    assertThat(NumberUtils.parseIntRange("[2,)"), is(Range.atLeast(2)));
    assertThat(NumberUtils.parseIntRange("(..2)"), is(Range.lessThan(2)));
    assertThat(NumberUtils.parseIntRange("(2..)"), is(Range.greaterThan(2)));
    assertThat(NumberUtils.parseIntRange("(..2]"), is(Range.atMost(2)));
    assertThat(NumberUtils.parseIntRange("[2..)"), is(Range.atLeast(2)));

    assertThat(NumberUtils.parseIntRange("(∞,2)"), is(Range.lessThan(2)));
    assertThat(NumberUtils.parseIntRange("(2,∞)"), is(Range.greaterThan(2)));
    assertThat(NumberUtils.parseIntRange("(∞,2]"), is(Range.atMost(2)));
    assertThat(NumberUtils.parseIntRange("[2,∞)"), is(Range.atLeast(2)));
    assertThat(NumberUtils.parseIntRange("(∞..2)"), is(Range.lessThan(2)));
    assertThat(NumberUtils.parseIntRange("(2..∞)"), is(Range.greaterThan(2)));
    assertThat(NumberUtils.parseIntRange("(∞..2]"), is(Range.atMost(2)));
    assertThat(NumberUtils.parseIntRange("[2..∞)"), is(Range.atLeast(2)));

    assertThat(NumberUtils.parseIntRange("(-∞,2)"), is(Range.lessThan(2)));
    assertThat(NumberUtils.parseIntRange("(-∞,2]"), is(Range.atMost(2)));
    assertThat(NumberUtils.parseIntRange("(-∞,]"), is(Range.all()));
}

@Test
public void testParseIntRange_parsesOK() {
    assertThat(NumberUtils.parseIntRange("(-2,3)"), is(Range.open(-2, 3)));
    assertThat(NumberUtils.parseIntRange("(-2,-1)"), is(Range.open(-2, -1)));
    assertThat(NumberUtils.parseIntRange("(2,3)"), is(Range.open(2, 3)));
    assertThat(NumberUtils.parseIntRange("[2,3)"), is(Range.closedOpen(2, 3)));
    assertThat(NumberUtils.parseIntRange("(2,3]"), is(Range.openClosed(2, 3)));
    assertThat(NumberUtils.parseIntRange("[2,3]"), is(Range.closed(2, 3)));

    assertThat(NumberUtils.parseIntRange("(2..3)"), is(Range.open(2, 3)));
    assertThat(NumberUtils.parseIntRange("[2..3)"), is(Range.closedOpen(2, 3)));
    assertThat(NumberUtils.parseIntRange("(2..3]"), is(Range.openClosed(2, 3)));
    assertThat(NumberUtils.parseIntRange("[2..3]"), is(Range.closed(2, 3)));
}

@Test
public void testParseIntRange_WithInvalidStrings_failsAccordingly() {
    // Note: null only reaches the catch below if the null check throws
    // IllegalArgumentException; a NullPointerException would fail this test.
    String[] invalidParams = {
        null, "", "(4 5", "[2,3] ", " [2,3]", "[2,3][2,3]", "[a,b]", " [2..3]", "[2.3]",
        "[3...4]", "(3 4)", "[2]", "(5,1)", "ab[2,4]", "[2,4]cd", "(2,-2)", "(2,2)"
    };
    for (String invalidParam : invalidParams) {
        try {
            NumberUtils.parseIntRange(invalidParam);
            fail("Parsing '" + invalidParam + "' did not fail");
        } catch (IllegalArgumentException ex) {
            // expected: the input was rejected
        }
    }
}

Upvotes: 2

Alexis C.

Reputation: 93892

If we look at the pattern, the interval starts with either '[' or '(', is followed by one or more digits, then a comma, then one or more digits again, and ends with either ']' or ')'.

So the regular expression will look like this:

^[\\(\\[](\\d+),(\\d+)[\\)\\]]$

Here it is decomposed:

^
 [\\(\\[] -> starts with either `'('` or `'['` (we need to escape the special characters with `\\`; no `|` is needed inside a character class, where it would match a literal `'|'`)
 (\\d+) -> followed by one or more digits that we capture in a group
 , -> followed by a comma
 (\\d+) -> followed again by one or more digits that we capture in another group
 [\\)\\]] -> and that finishes with either `')'` or `']'`
$

^ and $ assert that the whole string matches the expression and not only a part of it.

So we have the regex, yay!

Now we need to create a Pattern instance from it so that we can get a Matcher from it. Finally, we check whether the string matches the pattern and grab the corresponding groups:

Pattern p = Pattern.compile("^[\\(\\[](\\d+),(\\d+)[\\)\\]]$");
Matcher m = p.matcher("(0,100)");

if (m.matches()) {
    int lowerBound = Integer.parseInt(m.group(1));
    int upperBound = Integer.parseInt(m.group(2));
    System.out.println(lowerBound + "_" + upperBound);
}

This outputs 0_100.

Now the final step: get the first and last characters and create the appropriate range from them. Putting it all together:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.google.common.collect.Range;

class RangeFactory {

    private static final Pattern p = Pattern.compile("^[\\(\\[](\\d+),(\\d+)[\\)\\]]$");

    public static Range<Integer> from(String range) {
        Matcher m = p.matcher(range);
        if (m.matches()) {
            int length = range.length();

            int lowerBound = Integer.parseInt(m.group(1));
            int upperBound = Integer.parseInt(m.group(2));

            // The first and last characters decide which bounds are open or closed.
            if (range.charAt(0) == '(') {
                if (range.charAt(length - 1) == ')') {
                    return Range.open(lowerBound, upperBound);
                }
                return Range.openClosed(lowerBound, upperBound);
            } else {
                if (range.charAt(length - 1) == ')') {
                    return Range.closedOpen(lowerBound, upperBound);
                }
                return Range.closed(lowerBound, upperBound);
            }
        }
        throw new IllegalArgumentException("Range " + range + " is not valid.");
    }
}

Here are some test cases:

List<String> ranges =
    Arrays.asList("(0,100)", "[0,100]", "[0,100)", "(0,100]", "", "()", "(0,100", "[,100]", "[100]");

for(String range : ranges) {
    try {
        System.out.println(RangeFactory.from(range));
    } catch (IllegalArgumentException ex) {
        System.out.println(ex);
    }
}

which outputs:

(0‥100)
[0‥100]
[0‥100)
(0‥100]
java.lang.IllegalArgumentException: Range  is not valid.
java.lang.IllegalArgumentException: Range () is not valid.
java.lang.IllegalArgumentException: Range (0,100 is not valid.
java.lang.IllegalArgumentException: Range [,100] is not valid.
java.lang.IllegalArgumentException: Range [100] is not valid.
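
Two details of a bracket-matching expression like this one can be checked with java.util.regex alone, without Guava. The sketch below (the class and constant names are made up for illustration) shows that the ^ and $ anchors matter when Matcher.find() is used, while Matcher.matches() already requires the whole input to match, and that a | written inside a character class is matched as a literal pipe rather than acting as "or".

import java.util.regex.Pattern;

class AnchorDemo {
    // Anchored and unanchored variants of the same bracket/digits expression.
    static final Pattern ANCHORED = Pattern.compile("^[\\(\\[](\\d+),(\\d+)[\\)\\]]$");
    static final Pattern UNANCHORED = Pattern.compile("[\\(\\[](\\d+),(\\d+)[\\)\\]]");
    // Variant with '|' inside the character classes: the pipe becomes an accepted character.
    static final Pattern WITH_PIPE = Pattern.compile("^[\\(|\\[](\\d+),(\\d+)[\\)|\\]]$");

    public static void main(String[] args) {
        String padded = "ab(0,100)cd";
        System.out.println(UNANCHORED.matcher(padded).find());      // true: a substring matches
        System.out.println(ANCHORED.matcher(padded).find());        // false: anchors reject the padding
        System.out.println(UNANCHORED.matcher(padded).matches());   // false: matches() needs the whole input
        System.out.println(WITH_PIPE.matcher("|0,100|").matches()); // true: '|' is a literal in a class
        System.out.println(ANCHORED.matcher("|0,100|").matches());  // false
    }
}

So with matches() the anchors are redundant but harmless, and the character classes are best written without a |.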

You can extend the regex (to accept ranges with infinite bounds, etc.), but this should give you a good starting point.
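
As a rough sketch of such an extension, assuming that an empty endpoint is meant to stand for an infinite bound and that negative integers should be accepted, each endpoint can be made optional and signed. ParsedInterval and its fields are hypothetical names, not part of Guava; the sketch uses only java.util.regex and returns the components from which a Range could then be built.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the parsed pieces; a null endpoint means "unbounded".
final class ParsedInterval {
    final boolean lowerClosed;
    final Integer lower;
    final Integer upper;
    final boolean upperClosed;

    private ParsedInterval(boolean lowerClosed, Integer lower, Integer upper, boolean upperClosed) {
        this.lowerClosed = lowerClosed;
        this.lower = lower;
        this.upper = upper;
        this.upperClosed = upperClosed;
    }

    // Each endpoint is an optional, optionally negative integer (assumption:
    // a missing endpoint denotes an infinite bound).
    private static final Pattern P =
        Pattern.compile("^([\\(\\[])(-?\\d+)?,(-?\\d+)?([\\)\\]])$");

    static ParsedInterval parse(String text) {
        Matcher m = P.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Range " + text + " is not valid.");
        }
        // An optional group that did not take part in the match is returned as null.
        // Integer.valueOf throws NumberFormatException (an IllegalArgumentException)
        // if the digits do not fit into an int.
        Integer lower = m.group(2) == null ? null : Integer.valueOf(m.group(2));
        Integer upper = m.group(3) == null ? null : Integer.valueOf(m.group(3));
        return new ParsedInterval(m.group(1).equals("["), lower, upper, m.group(4).equals("]"));
    }

    public static void main(String[] args) {
        ParsedInterval r = parse("[-5,)");
        System.out.println(r.lowerClosed + " " + r.lower + " " + r.upper + " " + r.upperClosed);
    }
}

From these pieces, Guava's Range.range, Range.downTo and Range.upTo (each taking a BoundType of OPEN or CLOSED), or Range.all() when both endpoints are missing, can build the actual Range.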

Hope it helps! :)

Upvotes: 4
