Mux

Reputation: 51

Java regex to match only words in double square brackets

I have text like this (for example, read from a file):

[[dadasd sadasd sdsd ad asddd]] [[dasdsd]] dsdsd [[dsdas]] ... [[dd ssas dd]]

I want to extract only the text between double square brackets. How can I do this in Java?

// This one is not working:
String patternStr = "(.*)\\[\\[(.*)\\]\\](.*)";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher("");

// Set the input
matcher.reset("[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ");
...

Thanks in advance

Upvotes: 5

Views: 6166

Answers (3)

Kirill Polishchuk

Reputation: 56162

Your regex, slightly modified: \[\[(.*?)\]\] or, alternatively: \[\[([^\[\]]*)\]\]

The matched sentences are then in capture group 1.
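As a minimal sketch of reading capture group 1 in Java, assuming the second pattern above and the sample input from the question (the class name `BracketGroupDemo` and the method `extract` are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketGroupDemo {
    // [[ followed by any run of non-bracket characters, then ]].
    // Group 1 holds the text between the brackets.
    private static final Pattern DOUBLE_BRACKETS =
            Pattern.compile("\\[\\[([^\\[\\]]*)\\]\\]");

    // Hypothetical helper: collects group 1 of every match in order.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = DOUBLE_BRACKETS.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ";
        for (String sentence : extract(text)) {
            System.out.println(sentence);
        }
    }
}
```

The key point is calling `find()` in a loop rather than `matches()`, so every bracketed span in the input is visited.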


Or use this regex with lookbehind and lookahead, so that the match itself contains only the text between the brackets: (?<=\[\[)[^\[\]]*(?=\]\])
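A rough Java sketch of the lookaround variant, assuming the same sample input as in the question (the class name `BracketLookaroundDemo` and the method `extract` are illustrative only). Because the brackets are matched by zero-width assertions, `group()` with no argument already returns just the inner text:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketLookaroundDemo {
    // (?<=\[\[) requires "[[" immediately before the match,
    // (?=\]\]) requires "]]" immediately after it; neither is consumed.
    private static final Pattern INSIDE_BRACKETS =
            Pattern.compile("(?<=\\[\\[)[^\\[\\]]*(?=\\]\\])");

    // Hypothetical helper: collects the whole match (group 0) each time.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = INSIDE_BRACKETS.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ";
        System.out.println(extract(text));
    }
}
```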


Upvotes: 5

wjans

Reputation: 10115

You can do something like this:

Pattern pattern = Pattern.compile("\\[+(.*?)\\]+");
Matcher matcher = pattern.matcher("[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ");
while(matcher.find()) {
    System.out.println(matcher.group(1));
}

It will output:

sdasd
ddssssssssssss
dd
asdsa
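One thing worth checking with this pattern: `\\[+` and `\\]+` accept one or more brackets, so text in single brackets is matched as well. The sketch below compares it with a stricter pattern that requires exactly two brackets on each side; the input string, the class name `SingleBracketCheck`, and the helper `groups` are assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SingleBracketCheck {
    // Pattern from the answer above: one or more brackets on each side.
    static final Pattern LOOSE = Pattern.compile("\\[+(.*?)\\]+");
    // Stricter alternative: two opening and two closing brackets.
    static final Pattern STRICT = Pattern.compile("\\[\\[(.*?)\\]\\]");

    // Hypothetical helper: group 1 of every match of the given pattern.
    static List<String> groups(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "[single] and [[double]]";
        System.out.println(groups(LOOSE, text));  // includes "single"
        System.out.println(groups(STRICT, text)); // only "double"
    }
}
```

If the input never contains single brackets, both patterns behave the same on it.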

Upvotes: 3

Joey

Reputation: 354406

You want the following regex:

\[\[(.+?)\]\]

which, written as a Java string literal (each backslash doubled), becomes:

"\\[\\[(.+?)\\]\\]"

PowerShell test:

PS Home:\> [regex]::Matches("[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ", '\[\[(.+?)\]\]') | ft -auto

Groups                               Success Captures             Index Length Value
------                               ------- --------             ----- ------ -----
{[[sdasd]], sdasd}                      True {[[sdasd]]}              0      9 [[sdasd]]
{[[ddssssssssssss]], ddssssssssssss}    True {[[ddssssssssssss]]}    14     18 [[ddssssssssssss]]
{[[dd]], dd}                            True {[[dd]]}                40      6 [[dd]]
{[[asdsa]], asdsa}                      True {[[asdsa]]}             55      9 [[asdsa]]
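For comparison, a Java sketch that reports roughly the same columns as the PowerShell table (index, length, full match, group 1). The class name `BracketMatchTable` and the method `describe` are hypothetical; only `Pattern`, `Matcher.find()`, `start()`, and `group()` from `java.util.regex` are used:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketMatchTable {
    // Same regex as above, escaped for a Java string literal.
    private static final Pattern DOUBLE_BRACKETS =
            Pattern.compile("\\[\\[(.+?)\\]\\]");

    // Hypothetical helper: one "index length value group1" line per match.
    static List<String> describe(String input) {
        List<String> rows = new ArrayList<>();
        Matcher matcher = DOUBLE_BRACKETS.matcher(input);
        while (matcher.find()) {
            String whole = matcher.group();   // e.g. [[sdasd]]
            String inner = matcher.group(1);  // e.g. sdasd
            rows.add(matcher.start() + " " + whole.length() + " " + whole + " " + inner);
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "[[sdasd]] ddd [[ddssssssssssss]] vvvddd [[dd]] asdasda [[asdsa]] ";
        for (String row : describe(text)) {
            System.out.println(row);
        }
    }
}
```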

Upvotes: 5
