perlnoob

Reputation: 33

Perl regex match has multiple empty lines

I'm trying to parse through a string with perl and put the matches into an array.

Ex. "FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5"

Output:

FUNC1(VALUE1) VALUE1
VALUE2
FUNC2(FUNC1(VALUE3)) VALUE3
VALUE4
FUNC3(VALUE5) VALUE5

My code:

my $in = "FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5";

my @cols = ($in =~ /((?&full_m)),?
(?(DEFINE)
            (?<full_m>(?&full_f)|(?&word))
            (?<full_f>(?&func)\s(?&word))
            (?<func>(?&word)\((?&worf)\))
            (?<worf>(?&func)|(?&word))
            (?<word>\s*\w+\s*)
        )/gx);
print "$in\n";

my $count = 1;
foreach (@cols) {
    print "$count: $_\n";
    ++$count;
}

The problem is that I get the expected matches, but each one is followed by 5 empty entries:

1: FUNC1(VALUE1) VALUE1
2: 
3: 
4: 
5: 
6: 
7:  VALUE2
8: 
9: 
10: 
11: 
12: 
13:  FUNC2(FUNC1(VALUE3)) VALUE3
14: 
15: 
16: 
17: 
18: 
19:  VALUE4
20: 
21: 
22: 
23: 
24: 
25:  FUNC3(VALUE5) VALUE5
26: 
27: 
28: 
29: 
30: 

Upvotes: 2

Views: 90

Answers (1)

user557597

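In list context, a /g match returns every capture group from every match. Your pattern has six groups: group 1 plus the five named groups inside (?(DEFINE)...), which are only entered through (?&name) calls and are left unset afterwards, so each match contributes five empty entries. This is the same regex, except that it matches in a loop and stores only group 1 in the @cols array.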

my $in = "FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5";
my @cols;
while ($in =~ /((?&full_m)),?(?(DEFINE)(?<full_m>(?&full_f)|(?&word))(?<full_f>(?&func)\s(?&word))(?<func>(?&word)\((?&worf)\))(?<worf>(?&func)|(?&word))(?<word>\s*\w+\s*))/gx)
{
   push @cols, $1;
}
print "$in\n";

my $count = 1;
foreach (@cols) {
    print "$count: $_\n";
    ++$count;
}

Output:

FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5
1: FUNC1(VALUE1) VALUE1
2:  VALUE2
3:  FUNC2(FUNC1(VALUE3)) VALUE3
4:  VALUE4
5:  FUNC3(VALUE5) VALUE5
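
If you would rather keep the original list-context assignment, one possible alternative is to filter out the unset groups afterwards. The sketch below assumes the empty entries are undef or empty strings and that no legitimate item is empty; the names $re and @all are only illustrative.

```perl
use strict;
use warnings;

my $in = "FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5";

# Same pattern as above, compiled once with qr and the x flag.
my $re = qr/
    ( (?&full_m) ) ,?
    (?(DEFINE)
        (?<full_m> (?&full_f) | (?&word) )
        (?<full_f> (?&func) \s (?&word) )
        (?<func>   (?&word) \( (?&worf) \) )
        (?<worf>   (?&func) | (?&word) )
        (?<word>   \s* \w+ \s* )
    )
/x;

# List context with the g flag: six values per match (group 1 plus five unset groups).
my @all = ($in =~ /$re/g);

# Keep only the entries that actually hold text.
my @cols = grep { defined && length } @all;

printf "%d values returned, %d kept\n", scalar @all, scalar @cols;
print "$_\n" for @cols;
```

The while loop is arguably clearer, since it never builds the padded list in the first place; the grep version is just a smaller change to the original code.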

The regex is easier to read when expanded:

 ( (?&full_m) )                # (1)
 ,?
 (?(DEFINE)
      (?<full_m>                    # (2 start)
           (?&full_f) 
        |  (?&word)
      )                             # (2 end)
      (?<full_f>                    # (3 start)
           (?&func) \s (?&word)
      )                             # (3 end)
      (?<func>                      # (4 start)
           (?&word) \( (?&worf) \)
      )                             # (4 end)
      (?<worf>                      # (5 start)
           (?&func) 
        |  (?&word)
      )                             # (5 end)
      (?<word> \s* \w+ \s* )        # (6)
 )
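
The expanded form can be used as-is under the x flag. Note that items 2 to 5 in the output above keep a leading space, because word allows \s* on both sides. As a rough sketch (the variable names $item_re and $item are made up for illustration), the loop below uses the expanded regex and trims each item before storing it:

```perl
use strict;
use warnings;

# Expanded pattern; whitespace and comments are ignored under the x flag.
my $item_re = qr/
    ( (?&full_m) )                  # (1) the only group we want to keep
    ,?
    (?(DEFINE)
        (?<full_m> (?&full_f) | (?&word) )
        (?<full_f> (?&func) \s (?&word) )
        (?<func>   (?&word) \( (?&worf) \) )
        (?<worf>   (?&func) | (?&word) )
        (?<word>   \s* \w+ \s* )
    )
/x;

my $in = "FUNC1(VALUE1) VALUE1, VALUE2, FUNC2(FUNC1(VALUE3)) VALUE3, VALUE4, FUNC3(VALUE5) VALUE5";

my @cols;
while ($in =~ /$item_re/g) {
    # Copy group 1, then strip leading and trailing whitespace from the copy.
    (my $item = $1) =~ s/^\s+|\s+$//g;
    push @cols, $item;
}

print "$_\n" for @cols;
```

This should print the five items without the leading spaces, which is the output the question asks for.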

Upvotes: 1
