Reputation: 1387
Would anyone be able to help me with this regex, please? I need an expression that matches only lines that do not contain the "Created" string at the end. The script is used to read the headings in some source code.
$string = "* JAN-01-2001 bugsbunny 1234 Created Module";
#$string = "* DEC-12-2012 bugsbunny 5678 Modified Module";
if($string =~ /^\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+(?!Created)/){
print "$1\n$2\n$3\n$4\n";
} else {
print "no match\n";
}
With the first $string definition, I need the match to fail because the line has the word "Created" at the end of it. With the second $string definition, it should pass, and I need to pull out the date ($1), user ($2), change number ($3) and description ($4).
The expression above is not working. Any advice please?
Upvotes: 4
Views: 7007
Reputation: 1
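#$string = "* JAN-02-2001 bugsbunny 1234 Created Module";
$string = "* DEC-12-2012 bugsbunny 5678 Modified Module";
# [^Created] is a character class (any one character other than C, r, e, a, t, d),
# so use a negative lookahead instead; \S forces \s+ to consume all the whitespace.
if($string =~ /^\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+(?!Created)(\S.*)/){
print "$1\n$2\n$3\n$4\n";
}
else {
print "no match\n";
}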
Upvotes: 0
Reputation: 6204
Another option is to split the line and test the description for 'Created':
use strict;
use warnings;
#my $string = "* JAN-01-2001 bugsbunny 1234 Created Module";
my $string = "* DEC-12-2012 bugsbunny 5678 Modified Module";
my ( undef, $date, $user, $change, $desc ) = split ' ', $string, 5;
if ( $desc !~ /^Created/ ) {
print "$date\n$user\n$change\n$desc\n";
}
else {
print "no match\n";
}
Output:
DEC-12-2012
bugsbunny
5678
Modified Module
Upvotes: 0
Reputation:
Another trick you can use to make this work is a (?>...) group, an atomic group that disables backtracking into it. Inside such a group, any expression using + or * greedily eats up everything it can, and once the group has matched, the engine never goes back into it to try something else if the rest of the pattern fails. This means that all of the whitespace before "Created" is eaten up, so the (?!Created) part of the regex is always tested at exactly the right point.
if($string =~ /^(?>\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+)(?!Created)/){
print "$1\n$2\n$3\n";
} else {
print "no match\n";
}
This can also make the regex fail faster on non-matching lines, because the engine does not retry shorter matches inside the group.
This approach doesn't work for every kind of problem, because many regexes need to be able to backtrack in order to match correctly. But it will work great for this one.
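To illustrate the difference, here is a small sketch that runs the same pattern with and without the atomic group. The sample line with doubled spaces is an assumption made for the demonstration, not taken from the question:

```perl
use strict;
use warnings;

# Assumed sample line with two spaces between columns.
my $string = "*  JAN-01-2001  bugsbunny  1234  Created Module";

# Without the atomic group, \s+ can give back a space so that
# (?!Created) is tested against " Created" and succeeds.
my $plain  = qr/^\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+(?!Created)/;

# With (?>...), the engine cannot backtrack into the group.
my $atomic = qr/^(?>\*\s+(\w\w\w-\d\d-\d\d\d\d)\s+(\w+)\s+(\d+)\s+)(?!Created)/;

print "plain:  ", ($string =~ $plain  ? "match" : "no match"), "\n";
print "atomic: ", ($string =~ $atomic ? "match" : "no match"), "\n";
```

On this input the plain pattern reports a match and the atomic one does not, which is the behaviour the question asks for.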
Upvotes: 1
Reputation: 336218
Close:
/^\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+(?!.*Created)/
You need to allow any number of non-newline characters before Created, hence the .*.
Otherwise, if more than one whitespace character precedes Created, the regex would simply back up by one character when matching \s+, so the following text would be " Created", and then (?!Created) would succeed.
Notice how, with your pattern, the match stops one space before Created.
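As a minimal sketch, the corrected pattern can be extended with a fourth capture group so that the description is available as well, which the question asks for. The helper name parse_header and the doubled spaces in the sample lines are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (date, user, change, description) in list
# context, or an empty list when the rest of the line mentions "Created".
sub parse_header {
    my ($line) = @_;
    return $line =~ /^\*\s+(\w{3}-\d{2}-\d{4})\s+(\w+)\s+(\d+)\s+(?!.*Created)(.*)/;
}

# Assumed sample data with two spaces between columns.
my @lines = (
    "*  JAN-01-2001  bugsbunny  1234  Created Module",
    "*  DEC-12-2012  bugsbunny  5678  Modified Module",
);

for my $line (@lines) {
    my @fields = parse_header($line);
    print @fields ? join("\n", @fields) . "\n" : "no match\n";
}
```

Note that (?!.*Created) rejects the line if "Created" appears anywhere after the change number, not only as the first word of the description.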
Upvotes: 4