ryekayo

Reputation: 2421

Regexing in Java for (\d{4},\d{4})

I am trying to read a file and find only the numeric values in parentheses. For example:

(0000,0002)
(0000,0003)
(0002,0005)

I have created a regex to search for these in Java, as shown:

public String matchDICOMTags = "^[(][\\d{4},][\\d{4}][)]$";
public Pattern pattern = Pattern.compile(matchDICOMTags);

However, in my method, when execution reaches this line:

        Matcher m = pattern.matcher(dcmObj.toString());

It does not continue with the code. I am starting to think it is a problem with my regex but I am not certain. Can someone tell me if my pattern is correct?

Upvotes: 1

Views: 1141

Answers (2)

Bohemian

Reputation: 424993

Here's a one-liner to get a list of String[] pairs:

List<String[]> pairs = Arrays.stream(input.split("[\n\r]+"))
        .map(s -> s.replaceAll(".*\\((\\d{4},\\d{4})\\).*", "$1"))
        .filter(s -> s.length() == 9)
        .map(s -> s.split(","))
        .collect(Collectors.toList());

Some test code:

String input = "foo(0000,0002)bar\n(0003,0004) bar\nfoo(0005,0006)";
Arrays.stream(input.split("[\n\r]+"))
        .map(s -> s.replaceAll(".*\\((\\d{4},\\d{4})\\).*", "$1"))
        .filter(s -> s.length() == 9)
        .map(s -> s.split(","))
        .map(Arrays::toString)
        .forEach(System.out::println);

Output:

[0000, 0002]
[0003, 0004]
[0005, 0006]
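
One caveat with the stream version above: it keeps at most one tag per line (the last one, because the leading .* is greedy), and any unmatched line that happens to be exactly 9 characters long slips through the length filter. If that matters for your data, a Matcher loop with two capture groups avoids both issues. This is only a sketch; the class name TagPairs and the helper extractPairs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagPairs {
    // Two capture groups, one per four-digit half of the tag.
    private static final Pattern TAG = Pattern.compile("\\((\\d{4}),(\\d{4})\\)");

    // Hypothetical helper: returns every pair of four-digit values found in the text.
    static List<String[]> extractPairs(String text) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Two tags on the first line, and a 9-character line with no tag at all.
        String input = "foo(0000,0002)bar(0003,0004)\nabcdefghi\nfoo(0005,0006)";
        for (String[] pair : extractPairs(input)) {
            System.out.println(Arrays.toString(pair));
        }
    }
}
```

With that input it prints [0000, 0002], [0003, 0004] and [0005, 0006], and the abcdefghi line is ignored.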

Upvotes: 1

Adam

Reputation: 36703

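The \d{4} patterns for digits should not be inside [], because a character class matches exactly one character: [\d{4},] means "a digit, {, 4, } or a comma", not "four digits followed by a comma". The , does not need to be inside a [] block either. The ^ and $ anchors are only appropriate when the input is a single tag and nothing else; to find tags inside larger text, as in the test below, leave them out.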

Move them outside the brackets:

"[(]\\d{4},\\d{4}[)]";
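
To see what the pattern from the question actually accepts, here is a small sketch comparing it with the pattern above. The class name CharClassDemo and its constants are just for illustration.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Pattern from the question: each [...] matches exactly ONE character.
    static final Pattern ORIGINAL = Pattern.compile("^[(][\\d{4},][\\d{4}][)]$");
    // Quantifiers moved outside the brackets.
    static final Pattern FIXED = Pattern.compile("[(]\\d{4},\\d{4}[)]");

    public static void main(String[] args) {
        // A real tag is ten characters long, but ORIGINAL only accepts four.
        System.out.println(ORIGINAL.matcher("(0000,0002)").find()); // false
        // '(' + one char from [\d{4},] + one char from [\d{4}] + ')'
        System.out.println(ORIGINAL.matcher("(12)").find());        // true
        System.out.println(ORIGINAL.matcher("(,{)").find());        // true
        System.out.println(FIXED.matcher("(0000,0002)").find());    // true
    }
}
```

So the original pattern compiles fine, which is why nothing complains; it just describes a four-character string rather than a tag.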

Test

String test = "other stuff (0000,0002) foo \n(0000,0003) bar \n(0002,0005)baz";
Pattern pattern = Pattern.compile("[(](\\d{4}),(\\d{4})[)]");
Matcher matcher = pattern.matcher(test);
while (matcher.find()) {
    System.out.println(String.format("(%s,%s)", matcher.group(1), matcher.group(2)));
}

Output

(0000,0002)
(0000,0003)
(0002,0005)
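
On the ^ and $ anchors: if you do want to keep them, for example to accept only lines that consist of a tag and nothing else, they need Pattern.MULTILINE once the input holds more than one line. A minimal sketch, with AnchorDemo and countMatches being made-up names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorDemo {
    // Hypothetical helper: counts how many times the pattern is found in the text.
    static int countMatches(Pattern p, String text) {
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "(0000,0002)\n(0000,0003)\n(0002,0005)";
        String regex = "^[(]\\d{4},\\d{4}[)]$";

        // Default mode: ^ and $ refer to the start and end of the whole input.
        System.out.println(countMatches(Pattern.compile(regex), text)); // 0

        // MULTILINE: ^ and $ also match at the start and end of each line.
        System.out.println(countMatches(Pattern.compile(regex, Pattern.MULTILINE), text)); // 3
    }
}
```

Without the flag the anchored pattern finds nothing in multi-line input, which is one more reason to drop the anchors when scanning a whole file.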

Upvotes: 3
