Lampione

Reputation: 1608

How can I get non-matching groups using a Matcher in Java?

I'm trying to write a Java regex to capture some groups of words from a String using a Matcher.

Say I have this string: "Hello, we are @happy@ to see you today".

I would like to get two groups of matches, one containing

Hello, we are
to see you today

and the other

happy

So far, I was only able to match the word between the @s using this Pattern:

Pattern p = Pattern.compile("@(.+?)@");
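
For reference, here is a minimal sketch of how that pattern behaves on its own. The class and method names (InsideOnlyDemo, insideParts) are made up for illustration. It returns only the text between the @s; the surrounding text is never captured:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InsideOnlyDemo {
    // Collects the text captured by group 1 of "@(.+?)@" for every match.
    static List<String> insideParts(String input) {
        Matcher m = Pattern.compile("@(.+?)@").matcher(input);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            parts.add(m.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [happy]; "Hello, we are " and " to see you today" are not returned.
        System.out.println(insideParts("Hello, we are @happy@ to see you today"));
    }
}
```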

I've read about negative lookahead and lookaround and played with them a bit, but without success.

I assume I should somehow negate the regex I have so far, but I couldn't come up with anything.

Any help would be really appreciated, thank you.

Upvotes: 0

Views: 772

Answers (3)

Andreas

Reputation: 159086

From comment:

I may run into a string with more than one instance of words wrapped in @, such as "@Hello@ kind @stranger@"

From comment:

I need to apply different style formatting to the text inside and outside.

Since you need to apply different styles, the code needs to process each block of text separately, and it needs to know whether the text is inside or outside a @..@ section.

Note that the following code silently skips the last @ if there is an odd number of them.

String input = ...
for (Matcher m = Pattern.compile("([^@]+)|@([^@]+)@").matcher(input); m.find(); ) {
    if (m.start(1) != -1) {
        String outsideText = m.group(1);
        System.out.println("Outside: \"" + outsideText + "\"");
    } else {
        String insideText = m.group(2);
        System.out.println("Inside: \"" + insideText + "\"");
    }
}

Output for input = "Hello, we are @happy@ to see you today"

Outside: "Hello, we are "
Inside: "happy"
Outside: " to see you today"

Output for input = "@Hello@ kind @stranger@"

Inside: "Hello"
Outside: " kind "
Inside: "stranger"

Output for input = "This @text@ has unpaired @ characters"

Outside: "This "
Inside: "text"
Outside: " has unpaired "
Outside: " characters"
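
To show how the inside/outside distinction could drive the styling, here is a minimal sketch using the same pattern. The styling itself is an assumption for illustration only: inside text is wrapped in <b>...</b> and outside text is copied unchanged. The names StyleDemo and style are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StyleDemo {
    // Same pattern as above: group 1 is outside text, group 2 is text inside @..@.
    private static final Pattern SECTIONS = Pattern.compile("([^@]+)|@([^@]+)@");

    // Hypothetical styling: wrap inside text in <b>...</b>, copy outside text as-is.
    static String style(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = SECTIONS.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                out.append(m.group(1));
            } else {
                out.append("<b>").append(m.group(2)).append("</b>");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: Hello, we are <b>happy</b> to see you today
        System.out.println(style("Hello, we are @happy@ to see you today"));
        // Prints: <b>Hello</b> kind <b>stranger</b>
        System.out.println(style("@Hello@ kind @stranger@"));
    }
}
```

As with the loop above, an unpaired @ is dropped from the result rather than reported.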

Upvotes: 2

Alanpatchi

Reputation: 1199

Is this solution fine?

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Group 1: text outside @...@; group 2: text between a pair of @s.
    Pattern pattern = Pattern.compile("([^@]+)|@([^@]*)@");
    Matcher matcher = pattern.matcher("Hello, we are @happy@ to see you today");

    List<String> notBetween = new ArrayList<>();  // not surrounded by @
    List<String> between = new ArrayList<>();     // surrounded by @

    // Exactly one of the two groups is non-null for each match.
    while (matcher.find()) {
        if (matcher.group(1) != null) notBetween.add(matcher.group(1));
        if (matcher.group(2) != null) between.add(matcher.group(2));
    }

    System.out.println("Printing group 1");
    for (String string : notBetween) {
        System.out.println(string);
    }

    System.out.println("Printing group 2");
    for (String string : between) {
        System.out.println(string);
    }
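
One detail of this pattern: because the inner part is [^@]* rather than [^@]+, an empty @@ pair also counts as a match and yields an empty string. A minimal sketch comparing the two variants (the class and method names EmptyPairDemo and between are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyPairDemo {
    // Returns every non-null group 2 capture, i.e. the text found between a pair of @s.
    static List<String> between(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (matcher.find()) {
            if (matcher.group(2) != null) {
                result.add(matcher.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With *, the empty pair is captured as one empty string: prints 1
        System.out.println(between("([^@]+)|@([^@]*)@", "a@@b").size());
        // With +, the empty pair is skipped entirely: prints 0
        System.out.println(between("([^@]+)|@([^@]+)@", "a@@b").size());
    }
}
```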

Upvotes: 1

totok

Reputation: 1500

The best I could do is split the string into capture groups, then merge groups 1 and 4:

(^.*)(\@(.+?)\@)(.*)

EDIT: Taking the remarks from the comments into account:

(^[^\@]*)(?:\@(.+?)\@)([^\@]*)

Thanks to @Lino, we no longer capture the useless group that includes the @ characters, and the outer groups (now groups 1 and 3) match anything except @ instead of any non-whitespace character.
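
A minimal sketch of how the revised regex could be used from Java, with the backslashes doubled for the string literal. The class and method names (SinglePairDemo, split) are made up for illustration. Note that because of the ^ anchor, this only handles the first @...@ pair in the input:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePairDemo {
    // The revised regex from above, escaped for a Java string literal.
    private static final Pattern P =
            Pattern.compile("(^[^\\@]*)(?:\\@(.+?)\\@)([^\\@]*)");

    // Returns {outside, inside} for the first @...@ pair, or null if there is none.
    static String[] split(String input) {
        Matcher m = P.matcher(input);
        if (!m.find()) {
            return null;
        }
        // Groups 1 and 3 are the text before and after the pair; merge them.
        return new String[] { m.group(1) + m.group(3), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("Hello, we are @happy@ to see you today");
        System.out.println("Outside: \"" + parts[0] + "\"");
        System.out.println("Inside: \"" + parts[1] + "\"");
    }
}
```

The merged outside text keeps both spaces that surrounded the @happy@ section, so it may need trimming or normalizing depending on the use case.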

Upvotes: 1
