Radislav

Reputation: 2983

How to parse regex

Does anybody know how to parse the string below to get these two strings: [Test1][Test2] and [Test3][Test4]?

STRING:

Hello [Test1][Test2] world] [Test3][Test4] this is test].

Upvotes: 1

Views: 414

Answers (3)

Birei

Reputation: 36252

Using a Perl-flavour regex:

m/\[\S+/g

Test:

Content of script.pl:

use warnings;
use strict;

## Read every line that follows the __DATA__ marker.
while ( <DATA> ) { 

    ## Save in array '@matches' every run of characters that starts at an
    ## opening square bracket and continues until whitespace is found.
    ## The 'g' flag returns all such matches in the same line.
    my @matches = m/\[\S+/g;

    ## Print to output, one match per line.
    printf qq[%s\n], join qq[\n], @matches;
}

__DATA__
Hello [Test1][Test2] world] [Test3][Test4] this is test].

Run the script:

perl script.pl

Result:

[Test1][Test2]
[Test3][Test4]

Upvotes: 0

slartidan

Reputation: 21566

You will need a loop to handle a variable number of matches (which I assume is what you want).

I used the pattern .*?((?:\[.*?\])+)(.*). The first capturing group holds the desired string; the second capturing group always holds "the rest", which you then have to parse again.

The construct "(?: ... )" is a non-capturing group: it groups its contents without producing a capturing group (in Java regular expression syntax).

Here is a short Java sample:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketParser {
    public static void main(String[] args) {

        // define regular expression
        Pattern pattern = Pattern.compile(".*?((?:\\[.*?\\])+)(.*)");

        // iterate for each match
        Matcher matcher = pattern.matcher("Hello [Test1][Test2] world] [Test3][Test4] this is test].");
        while (matcher.matches()) {
            System.out.println("Found " + matcher.group(1));
            // group 2 is the unparsed rest; match against it on the next pass
            matcher = pattern.matcher(matcher.group(2));
        }
    }
}

That will output:

Found [Test1][Test2]
Found [Test3][Test4]

Sorry if this is kind of complicated; please let me/us know if you need a simpler example.
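
A simpler variant might look like the sketch below. It assumes the inner group from the pattern above, (?:\[.*?\])+, is enough on its own, and lets Matcher.find() walk through the string instead of re-parsing "the rest" by hand. The class name BracketRuns and the method findRuns are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch.
class BracketRuns {

    // Inner group of the pattern above, without the leading ".*?" and trailing "(.*)".
    private static final Pattern RUN = Pattern.compile("(?:\\[.*?\\])+");

    // Collects every run of adjacent [..] items, scanning left to right.
    static List<String> findRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = RUN.matcher(input);
        while (matcher.find()) {        // each find() resumes after the previous match
            runs.add(matcher.group());  // group() with no argument is the whole match
        }
        return runs;
    }

    public static void main(String[] args) {
        for (String run : findRuns("Hello [Test1][Test2] world] [Test3][Test4] this is test].")) {
            System.out.println("Found " + run);
        }
    }
}
```

Because find() keeps its own position in the input, there is no second group to carry the remainder and no new Matcher per iteration.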

Upvotes: 1

Konrad Dzwinel

Reputation: 37903

Try this: (\[[a-zA-Z0-9]+\]){2}

The whole match is the bracketed pair; capturing group 1 only holds the last of the two items.
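
As a rough illustration of how this pattern could be applied in Java, the sketch below collects the whole match for each pair. The class name PairFinder and the method findPairs are hypothetical. It assumes exactly two adjacent items per pair and only ASCII letters and digits inside the brackets, as the pattern requires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch.
class PairFinder {

    // Exactly two adjacent bracketed items made of ASCII letters and digits.
    private static final Pattern PAIR = Pattern.compile("(\\[[a-zA-Z0-9]+\\]){2}");

    // Returns the full text of every pair found in the input.
    static List<String> findPairs(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            // group() is the whole pair; group(1) would be only its second item,
            // because a repeated capturing group keeps its last iteration.
            pairs.add(matcher.group());
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String pair : findPairs("Hello [Test1][Test2] world] [Test3][Test4] this is test].")) {
            System.out.println(pair);
        }
    }
}
```

Note that the {2} quantifier is strict: a run of three items such as [A][B][C] yields only [A][B], and the leftover [C] does not match on its own.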

Upvotes: 1
