Reputation: 21443
My script takes in a filepath, and I want to append a directory to the end of the path. The issue is I want to be agnostic of whether the argument has a trailing slash or not. So for example:
$ perl myscript.pl /path/to/dir
/path/to/dir/new
$ perl myscript.pl /path/to/dir/
/path/to/dir/new
I tried $path =~ s/\/?$/\/new/g, but that results in a doubled /new if a trailing slash is present:
$ perl myscript.pl /path/to/dir/
/path/to/dir/new/new
$ perl myscript.pl /path/to/dir
/path/to/dir/new
What's wrong?
Upvotes: 0
Views: 53
Reputation: 53478
Because /g is 'global' and will match multiple times:
#!/usr/bin/env perl
use strict;
use warnings;
#turn on debugging
use re 'debug';
my $path = '/path/to/dir/';
$path =~ s/\/?$/\/new/g;
print $path;
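The first match consumes the trailing / and replaces it. With /g the regex engine then resumes at the end of the string, where the optional / matches zero times and $ still matches. So the pattern matches a second time, as an empty match, and /new is inserted again.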
E.g.:
Compiling REx "/?$"
Final program:
1: CURLY {0,1} (5)
3: EXACT </> (0)
5: SEOL (6)
6: END (0)
floating ""$ at 0..1 (checking floating) minlen 0
Matching REx "/?$" against "/path/to/dir/"
Intuit: trying to determine minimum start position...
doing 'check' fbm scan, [0..13] gave 13
Found floating substr ""$ at offset 13 (rx_origin now 12)...
(multiline anchor test skipped)
try at offset...
Intuit: Successfully guessed: match at offset 12
12 <path/to/dir> </> | 1:CURLY {0,1}(5)
EXACT </> can match 1 times out of 1...
13 <path/to/dir/> <> | 5: SEOL(6)
13 <path/to/dir/> <> | 6: END(0)
Match successful!
Matching REx "/?$" against ""
Intuit: trying to determine minimum start position...
doing 'check' fbm scan, [13..13] gave 13
Found floating substr ""$ at offset 13 (rx_origin now 13)...
(multiline anchor test skipped)
Intuit: Successfully guessed: match at offset 13
13 <path/to/dir/> <> | 1:CURLY {0,1}(5)
EXACT </> can match 0 times out of 1...
13 <path/to/dir/> <> | 5: SEOL(6)
13 <path/to/dir/> <> | 6: END(0)
Match successful!
Matching REx "/?$" against ""
Intuit: trying to determine minimum start position...
doing 'check' fbm scan, [13..13] gave 13
Found floating substr ""$ at offset 13 (rx_origin now 13)...
(multiline anchor test skipped)
Intuit: Successfully guessed: match at offset 13
13 <path/to/dir/> <> | 1:CURLY {0,1}(5)
EXACT </> can match 0 times out of 1...
13 <path/to/dir/> <> | 5: SEOL(6)
13 <path/to/dir/> <> | 6: END(0)
This is because $ is a zero-width anchor, and \/? is also zero-width when there is no slash to match. Once the pattern has matched the trailing / and replaced it, the regex engine continues (because you told it to with /g) and finds only the end of the string left. That is still a valid, empty match for \/?$, so it gets replaced too.
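If the debug trace is hard to read, a shorter way to see the two matches is to loop over them with m//g and print where each one starts. This is only a sketch; match_positions is a hypothetical helper name, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return [start offset, matched text] for every
# match of /?$ in the given string, using the same pattern as the s///g.
sub match_positions {
    my ($str) = @_;
    my @found;
    while ($str =~ m{/?$}g) {
        # @- and @+ hold the start and end offsets of the last match
        push @found, [ $-[0], substr($str, $-[0], $+[0] - $-[0]) ];
    }
    return @found;
}

for my $m (match_positions('/path/to/dir/')) {
    printf "offset %d matched '%s'\n", @$m;
}
```

For '/path/to/dir/' this reports a match of '/' at offset 12 and then an empty match at offset 13, which are the two places s///g inserts /new.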
But why not use File::Spec instead:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use Data::Dumper;
my $path = '/path/to/dir/';
my @dirs = File::Spec->splitdir($path);
print Dumper \@dirs;
$path = File::Spec->catdir(@dirs, "new" );
print $path;
This gives you a platform-independent way to split and join path elements, and doesn't rely on regex matching, which can break in various ways (such as the one you found).
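If you don't need the individual path elements, you can probably skip splitdir and hand the path straight to catdir, since it cleans up the result itself. A minimal sketch, assuming a Unix-like system; append_dir is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: append one directory name to a path.
# catdir joins the parts and tidies the result, so a trailing
# slash on $path does not produce an empty element.
sub append_dir {
    my ($path, $name) = @_;
    return File::Spec->catdir($path, $name);
}

print append_dir($_, 'new'), "\n" for '/path/to/dir', '/path/to/dir/';
```

On a Unix-like system both calls should print /path/to/dir/new. On other platforms File::Spec uses that platform's separator, so the output will look different.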
Upvotes: 2
Reputation: 6626
Drop the /g modifier:
$path =~ s/\/?$/\/new/
works fine.
You only want to add one "new" at the end, so the /g modifier makes no sense.
Also, note that you can use different delimiters for your regex:
$path =~ s{ /? $}{/new}x;
is a little bit clearer.
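A quick way to check that the version without /g handles both inputs is to wrap it in a small sub and run it on each. This is just a sketch; add_new is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: replace an optional trailing slash with /new.
# No /g, so only the first (leftmost) match is replaced.
sub add_new {
    my ($path) = @_;
    $path =~ s{ /? $}{/new}x;
    return $path;
}

for my $in ('/path/to/dir', '/path/to/dir/') {
    print "$in -> ", add_new($in), "\n";
}
```

Both inputs come out as /path/to/dir/new.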
Upvotes: 1