John Lee

Reputation: 1387

Perl Regex Regular Expression match string except, not match string

Would anyone be able to help me with this regex, please? I need an expression that matches a line only if it does not contain the "Created" string at the end. The script is used to read the headings in some source code.

$string = "* JAN-01-2001   bugsbunny     1234     Created Module";
#$string = "* DEC-12-2012   bugsbunny     5678     Modified Module";
if($string =~ /^\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+(?!Created)/){
    print "$1\n$2\n$3\n$4\n";
} else {
    print "no match\n";
}

When using the first $string definition, I need the match to fail because it has the word "Created" at the end of it. When using the second $string definition, it should pass and I need to pull out the date($1), user($2), change number($3) and description($4).

The expression above is not working. Any advice please?

Upvotes: 4

Views: 7007

Answers (4)

user4365230

Reputation: 1

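#$string = "* JAN-02-2001   bugsbunny     1234     Created Module";
$string = "* DEC-12-2012   bugsbunny     5678     Modified Module";
# [^Created] would be a character class (one character that is not C, r, e, a, t or d), not a negated word.
# Use a negative lookahead instead, and start the description with \S so that \s+ cannot give back a space.
if($string =~ /^\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+(?!Created)(\S.*)/){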
    print "$1\n$2\n$3\n$4\n";
}
else {
    print "no match\n";
}

Upvotes: 0

Kenosis

Reputation: 6204

Another option is to split and test the description for 'Created':

use strict;
use warnings;

#my $string = "* JAN-01-2001   bugsbunny     1234     Created Module";
my $string = "* DEC-12-2012   bugsbunny     5678     Modified Module";

my ( undef, $date, $user, $change, $desc ) = split ' ', $string, 5;

if ( $desc !~ /^Created/ ) {
    print "$date\n$user\n$change\n$desc\n";
}
else {
    print "no match\n";
}

Output:

DEC-12-2012
bugsbunny
5678
Modified Module
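
The limit of 5 passed to split is what keeps the multi-word description in one piece. A minimal sketch of that effect, assuming the same heading layout as above (the helper name split_heading is hypothetical and only there for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: split a heading line into at most $limit fields.
# A $limit of 0 tells split not to limit the number of fields.
sub split_heading {
    my ($line, $limit) = @_;
    return split ' ', $line, $limit;
}

my $line = "* DEC-12-2012   bugsbunny     5678     Modified Module";

my @limited   = split_heading($line, 5);   # description stays together
my @unlimited = split_heading($line, 0);   # description is split on its space

print scalar(@limited),   " fields, last: '$limited[-1]'\n";
print scalar(@unlimited), " fields, last: '$unlimited[-1]'\n";
```

With the limit, the fifth field is "Modified Module"; without it, the description is broken into two separate fields.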

Upvotes: 0

user1919238

Reputation:

Another trick that makes this work is an atomic group, (?>...), which disables backtracking into the group. Once the group has matched, any + or * inside it keeps whatever it greedily consumed, and the engine never goes back to try a shorter match if the rest of the pattern fails. All of the whitespace before "Created" is therefore consumed, so the (?!Created) lookahead is always tested at exactly the right position.

if($string =~ /^(?>\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+)(?!Created)/){
    print "$1\n$2\n$3\n";
} else {
    print "no match\n";
}

This can also make the regex faster on lines that do not match, because the engine gives up immediately instead of retrying shorter matches.

This approach doesn't work for every kind of problem, because many regexes need to be able to backtrack in order to match correctly. But it will work great for this one.
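
If the description is needed as well, one possible extension is to capture it right after the lookahead. This is only a sketch under the assumption that the description is everything up to the end of the line; the helper name parse_heading_atomic is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (date, user, change, description),
# or an empty list when the line does not match.
sub parse_heading_atomic {
    my ($line) = @_;
    # The atomic group (?>...) keeps all the whitespace it consumed, so the
    # lookahead is tested exactly where the description begins.
    my @fields = $line =~
        /^(?>\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+)(?!Created)(.*)/;
    return @fields;
}

for my $line (
    "* JAN-01-2001   bugsbunny     1234     Created Module",
    "* DEC-12-2012   bugsbunny     5678     Modified Module",
) {
    my @fields = parse_heading_atomic($line);
    print @fields ? join("\n", @fields) . "\n" : "no match\n";
}
```

The first line prints "no match"; the second prints the date, user, change number and "Modified Module", one per line.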

Upvotes: 1

Tim Pietzcker

Reputation: 336218

Close:

/^\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+(?!.*Created)/

You need to allow any number of non-newline characters before Created, hence the .* inside the lookahead.

Otherwise, the regex simply backs up by one character when matching \s+, so the following text is " Created", and the (?!Created) assertion then succeeds.

With your original pattern, the match stops one space before Created.
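
The backtracking can be made visible by printing the text that follows the end of the match. A small sketch, assuming the sample line from the question (the helper name rest_after_match is hypothetical; $original is the question's pattern written with the {n} quantifiers used above):

```perl
use strict;
use warnings;

# Hypothetical helper: return the text left over after the match,
# or undef if the pattern does not match at all.
sub rest_after_match {
    my ($line, $re) = @_;
    # $+[0] is the offset just past the end of the last successful match.
    return $line =~ $re ? substr($line, $+[0]) : undef;
}

my $created  = "* JAN-01-2001   bugsbunny     1234     Created Module";
my $original = qr/^\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+(?!Created)/;
my $fixed    = qr/^\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+(?!.*Created)/;

my $rest = rest_after_match($created, $original);
print defined $rest ? "original: matched, rest = '$rest'\n" : "original: no match\n";

$rest = rest_after_match($created, $fixed);
print defined $rest ? "fixed: matched, rest = '$rest'\n" : "fixed: no match\n";
```

The original pattern matches and leaves ' Created Module' (note the leading space), while the pattern with .* in the lookahead reports no match.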

Upvotes: 4
