FooManChooZE

Reputation: 31

Matching pattern only if it exists

I have the following code:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+)  Optional: (.+?)(  |$)/;

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

With the first $SourceStr, it works as expected. With the second one (commented out), which has no Optional field, the pattern does not match at all. Is there a way to make it match and have $4 populated with the empty string?

First string results:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[Some stuff here]

Second string results: No match

Want:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[]

Upvotes: 3

Views: 2793

Answers (5)

Joel Berger

Reputation: 20280

As documented here, it can be easier to deal with optional matches via named captures rather than numbered.

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

foreach (@SourceStr) {
  print "Input '$_'\n";
  if ( /$RegEx/ ) {
     print "Name = '$+{name}'\n";
     print "Time = '$+{time}'\n";
     print "State = '$+{state}'\n";
     print "Optional = '$+{optional}'\n" if defined $+{optional};
  }
  print "\n";
}

In fact, it makes it so easy that it's almost easier just to dump the %+ hash:

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

use Data::Dumper;
foreach (@SourceStr) {
  print "Input '$_'\n";
  print Dumper \%+ if /$RegEx/;
}
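
If the goal is the empty string the question asks for, rather than skipping the field, the named capture can be given a default. This is only a sketch: parse_fields is a hypothetical helper name, not something from the answer above, and the //= operator assumes Perl 5.10 or later (which named captures need anyway).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, not part of the answer above: returns a hash of the
# named fields, with 'optional' defaulted to '' when the field is absent.
sub parse_fields {
    my ($line) = @_;
    return unless $line =~ /Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(?:  |$)/;
    my %fields = %+;             # copy: %+ only reflects the last successful match
    $fields{optional} //= '';    # defined-or assignment, Perl 5.10+
    return %fields;
}

my %f = parse_fields('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');
print "1=[$f{name}]\n2=[$f{time}]\n3=[$f{state}]\n4=[$f{optional}]\n";
```

A hash copy is used because %+ only lists the named groups that actually took part in the match, so the missing key has to be filled in afterwards.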

Upvotes: 1

perreal

Reputation: 97948

You can use a more specific regex:

#!/usr/bin/perl
use warnings;
use strict;

my @SourceStrA=('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
                'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');

my $RegEx = qr!Name:\s*(\w+)\s*Time:\s*([\d/]*\s*[\d:]*)\s*State:\s*(\d+)\s*(?:Optional:\s*(.*))?!;

for my $SourceStr (@SourceStrA) {
  print "$SourceStr\n";
  if ($SourceStr =~ m/$RegEx/) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print "4=[$4]\n" if defined $4; 
  }
}

Output:

Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]
4=[Some stuff here]
Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]

Upvotes: 2

Kenosis

Reputation: 6204

Here's an option that produces your desired results:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';

#my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:\s+Optional: (.+))?$/;

if ( $SourceStr =~ $RegEx ) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print '4=[' . ( $4 // '' ) . "]\n";
}

Upvotes: 1

Orabîg

Reputation: 11992

The request seems odd, but here is a solution:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: )?(.*)(  |$)/;

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

The trick, of course, is to use the non-capturing (?: ) syntax for the optional "  Optional: " prefix, so the extra group does not shift the position of $4. Putting the capture inside the optional group, as in (?:  Optional: (.*))?, would be more logical and robust, but it leaves $4 undefined when the field is absent (and you need it to be an empty string). Printing it then triggers a "Use of uninitialized value..." message, which comes from the use warnings pragma, not from use strict.

Anyway, these requirements look more like an exercise than a real-life problem, don't they?
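
The undefined-versus-empty distinction described above can be checked directly. This is a minimal sketch: fourth_capture is a hypothetical helper introduced only for illustration, and both patterns are simplified to end in $ rather than (  |$).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

# Capture inside the optional group: $4 is undef when the field is absent.
my $inside  = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: (.*))?$/;
# Capture outside the optional prefix: $4 always takes part and may be ''.
my $outside = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: )?(.*)$/;

# Hypothetical helper: reports the state of $4 after matching $str against $re.
sub fourth_capture {
    my ($str, $re) = @_;
    return 'no match' unless $str =~ $re;
    return defined $4 ? "'$4'" : 'undef';
}

print 'inside:  ', fourth_capture($line, $inside),  "\n";   # inside:  undef
print 'outside: ', fourth_capture($line, $outside), "\n";   # outside: ''
```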

Upvotes: 1

Ry-

Reputation: 224942

Maybe you should use a hash or something.

#!/usr/bin/perl
use warnings;
use strict;

#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my %Values;

while ($SourceStr =~ m/(\w+): (.+?)(?:  |$)/g) {
    $Values{$1} = $2;
}

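# Use defined() so that a legitimate value such as "0" is not treated as missing
if (defined $Values{Name} && defined $Values{Time} && defined $Values{State}) {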
    print "1=$Values{Name}\n";
    print "2=$Values{Time}\n";
    print "3=$Values{State}\n";

    if (defined $Values{Optional}) {
        print "4=$Values{Optional}\n";
    } else {
        print "4=\n";
    }
}

Upvotes: 1
