ION

Reputation: 177

Perl Match anything inside brackets placing each instance into groups

I am trying to grab everything inside square brackets (making sure to match only up to the first closing bracket ]).

I am using

$text=~ /\[(\w+)]/gmi

to find all 7 matches in this file.

Testing Testing Testing
[Test]
[Test][TestTest][PPPP]
[Test] [TestTest] [PPPP]
Test]

It only grabs the first instance of Test in each line, even with multiline matching set (/m).
I am trying to return each string that is inside brackets and nothing else (so, for example, not picking up Test]).

I tried this regex in an online regex tester, Regex Web Parser, which says that it should return all 7 matches.

use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Win32::OLE qw(in with);
use Win32::OLE::Const;
use Win32::OLE::Const 'Microsoft Word';
use Win32::OLE; $Win32::OLE::Warn = 3;  

my (@req_array,$document,$paragraphs,$paragraph,$enumerate,$style,$text,$word,$oldfile);

    eval {$word = Win32::OLE->GetActiveObject('Word.Application')}; 
    die "Word not installed" if $@; 

    unless (defined $word) {
        $word = Win32::OLE->new('Word.Application', sub { $_[0]->Quit; })
            or die "Oops, cannot start Word";
    }
    $word->Activate; 
    $word->{visible} = 1;

    #$oldfile =~ m!^(.+?)/([^/]+)$!;
    #my $dir = $1 . '/';
    #my $name = $2;
    #$word->ChangeFileOpenDirectory($dir);

    my $doc = $word->Documents->Open('C:\Users\n\Desktop\test.doc');

    print $ARGV[0] . "\n";

    $paragraphs = $doc->Paragraphs();

    $enumerate = new Win32::OLE::Enum($paragraphs);
    while(defined($paragraph = $enumerate->Next()))
    {
        $style = $paragraph->{Style}->{NameLocal};
        $text = $paragraph->{Range}->{Text};
        if($text=~ /\[(\w+)]/gmi)
        {
        print $1 . "\n";
        }

    }

Upvotes: 1

Views: 250

Answers (1)

Sobrique

Reputation: 53478

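If you capture part of a regex and use the 'g' flag, as you're doing, the match returns a list of all the captures when it is evaluated in list context, not a single string. In scalar context, such as the condition of your if, it only finds one match per evaluation, so $1 holds just the first one.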

Like this:

#!/usr/bin/perl

use strict;
use warnings;

my @matches;
while ( <DATA> ) {
   push ( @matches, m,\[(\w+)\],g );
}

print join ("\n", @matches );


__DATA__
Testing Testing Testing
[Test]
[Test][TestTest][PPPP]
[Test] [TestTest] [PPPP]
Test]

Regarding the multi-line strings raised in the comments: the code below works, and should work fine with your code. $1 is set each time you run the pattern match, and holds the first capture group. You can access the others with $2 and so on.

However, I think that style of matching falls down when you're working with an arbitrary number of possible matches, which is where an array is more apt.

#!/usr/bin/perl

use strict;
use warnings;

my $multi_line_str = q{Testing Testing Testing
[Test]
[Test][TestTest][PPPP]
[Test] [TestTest] [PPPP]
Test]};

print join ("\n", $multi_line_str =~ m,\[(\w+)\],gmi  );
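
If you would rather keep the $1 style from the question, a while loop with /g in scalar context steps through the matches one at a time. This is only a minimal sketch using the same sample text; $text here stands in for one paragraph's text, and @found is just an illustrative name:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Same sample text as above; in the question this would come from Word.
my $text = q{Testing Testing Testing
[Test]
[Test][TestTest][PPPP]
[Test] [TestTest] [PPPP]
Test]};

my @found;

# In scalar context, /g resumes where the previous match ended,
# so each pass through the loop sets $1 to the next bracketed word.
while ( $text =~ /\[(\w+)\]/g ) {
    push @found, $1;
}

print join( "\n", @found ), "\n";
```

Swapping the if in the question for a while like this should print every bracketed word in the paragraph, and the trailing Test] is skipped because it has no opening bracket.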

Upvotes: 3
