I am trying to parse the filename from paths. I have this:
my $filepath = "/Users/Eric/Documents/foldername/filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Linux path:";
print $1 . "\n\n";
print "-------\n";
my $filepath = "c:\\Windows\eric\filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Windows path:";
print $1 . "\n\n";
print "-------\n";
my $filepath = "filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Without path:";
print $1 . "\n\n";
print "-------\n";
But that returns:
Linux path:
-------
Windows path:Windowsic
ilename.pdf
-------
Without path:Windowsic
ilename.pdf
-------
I am expecting this:
Linux path:
filename.pdf
-------
Windows path:
filename.pdf
-------
Without path:
filename.pdf
-------
Can somebody please point out what I am doing wrong?
Thanks! :)
Upvotes: 2
Views: 4289
Reputation: 29854
Well, the answer to what is happening would be: various errors.
my $filepath = "/Users/Eric/Documents/foldername/filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Linux path:";
print $1 . "\n\n";
print "-------\n";
$filepath doesn't contain any backslashes, so the pattern doesn't match and $1 is never set. Your path uses / as the separator, so the expression would have to be:
# In list context, a successful match returns its captures.
my ( $path ) = $filepath =~ m|/([^/.]*\.[^/.]*)$|;
print "Linux path:$path\n\n-------\n"; # no need to concatenate: double-quoted strings interpolate variables
my $filepath = "c:\\Windows\eric\filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Windows path:";
print $1 . "\n\n";
print "-------\n";
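You're using a double-quoted string, which, taking its cue from UNIX shells, does more work than a single-quoted one: it interpolates variables and processes escape sequences. Here \e becomes the escape character and \f becomes a form feed. Thus, you need to escape all your backslashes, like this: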
my $filepath = "c:\\Windows\\eric\\filename.pdf";
or just use single quotes:
my $filepath = 'c:\Windows\eric\filename.pdf';
Actually, since Perl understands '/' as a path separator on Windows, this works too (but not with the regex, which looks for backslashes):
my $filepath = "c:/Windows/eric/filename.pdf";
As long as you fix it before handing it back to Windows.
my $filepath = "filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Without path:";
print $1 . "\n\n";
print "-------\n";
This didn't match either, so $1 still holds the capture from the last successful match. That's why the previous value is repeated. It also shows the value of assigning the captures in list context instead of reading $1 after a match that may have failed.
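To make the list-context idea concrete, here is a minimal sketch that handles all three inputs from the question. The helper name last_component is hypothetical, and the pattern assumes that both / and \ should be treated as separators (on a UNIX filesystem a backslash is a legal filename character, so this is a simplification rather than a general solution).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the last path component, treating both
# "/" and "\" as separators. Returns undef if the path ends in a separator.
sub last_component {
    my ($path) = @_;
    # In list context the match yields its capture, or an empty list on failure,
    # so $name is undef (never a stale $1) when nothing matches.
    my ($name) = $path =~ m{([^/\\]+)\z};
    return $name;
}

for my $path ('/Users/Eric/Documents/foldername/filename.pdf',
              'c:\Windows\eric\filename.pdf',
              'filename.pdf') {
    my $name = last_component($path);
    print defined $name ? "$name\n" : "(no filename)\n";
}
```

Each of the three paths prints filename.pdf. Because the result is assigned from the match itself, a failed match gives undef instead of silently reusing an earlier capture.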
Upvotes: 2
Reputation: 19725
In this case, as others have said, the mistake is doing it by hand.
In addition to File::Basename, you should take a look at File::Spec and Path::Class. They offer well-tested, cross-platform methods for handling files and directories. Path::Class in particular provides helper methods for dealing with file and directory names that are foreign to the system the script lives on. It looks like that might come in handy here.
#!/usr/bin/env perl
use strict;
use warnings;
use Path::Class qw/file foreign_file/;
my $nix = "/Users/Eric/Documents/foldername/filename.pdf";
my $win = 'c:\\Windows\eric\filename.pdf'; # single quote to avoid escape issues
print file($nix)->basename(), "\n";
print foreign_file('Win32', $win)->basename(), "\n";
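If installing Path::Class is not an option, a similar result should be possible with File::Spec, which ships with Perl. The sketch below calls the platform-specific classes File::Spec::Unix and File::Spec::Win32 directly, so each path is parsed by the rules of the system it came from. The variable names are only illustrative.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

my $nix = '/Users/Eric/Documents/foldername/filename.pdf';
my $win = 'c:\Windows\eric\filename.pdf';   # single quotes keep the backslashes literal

# splitpath returns (volume, directories, file); only the file part is kept here.
my (undef, undef, $nix_file) = File::Spec::Unix->splitpath($nix);
my (undef, undef, $win_file) = File::Spec::Win32->splitpath($win);

print "$nix_file\n";
print "$win_file\n";
```

Both lines print filename.pdf. Unlike Path::Class, this approach needs the caller to know which flavour of path it is holding.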
Upvotes: 7
Reputation: 1254
Perl provides this capability: http://perldoc.perl.org/File/Basename.html
You also need to be wary of string escapes. In your double-quoted Windows path, '\\' becomes a single backslash, '\e' becomes the escape character and '\f' becomes a form feed. It's been a while since I've dealt with Perl escapes, but I'm guessing the escape character then makes the terminal swallow the 'r' after it, and the form feed shows up as the line break. This explains the unexpected output.
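A quick way to check what the string really contains is to print it with the non-printable characters made visible. This is only a diagnostic sketch; the helper name visible is made up for the example.

```perl
use strict;
use warnings;

my $double = "c:\\Windows\eric\filename.pdf";   # \e and \f are processed as escapes
my $single = 'c:\Windows\eric\filename.pdf';    # backslashes stay literal

# Replace every non-printable character with its hex code so it becomes visible.
sub visible {
    my ($s) = @_;
    (my $copy = $s) =~ s/([^[:print:]])/sprintf('<0x%02X>', ord $1)/ge;
    return $copy;
}

print visible($double), "\n";   # c:\Windows<0x1B>ric<0x0C>ilename.pdf
print visible($single), "\n";   # c:\Windows\eric\filename.pdf
```

0x1B is the escape character and 0x0C is the form feed, so the double-quoted version has lost the "\e" and "\f" text before the regex ever sees it.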
Upvotes: 3
Reputation: 523774
Why not use File::Basename?
use File::Basename;
my $name = basename($filepath);
print "$name\n";
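One caveat: basename parses paths by the rules of the system the script runs on, so on a UNIX machine it will not split a Windows path at the backslashes. A minimal sketch of one way around that, assuming it is acceptable to switch the parsing rules temporarily with fileparse_set_fstype:

```perl
use strict;
use warnings;
use File::Basename qw(basename fileparse_set_fstype);

my $nix = '/Users/Eric/Documents/foldername/filename.pdf';
my $win = 'c:\Windows\eric\filename.pdf';

# fileparse_set_fstype returns the previous type, so it can be restored later.
my $previous = fileparse_set_fstype('MSWin32');
print basename($win), "\n";
print basename($nix), "\n";   # the MSWin32 rules also split on "/"
fileparse_set_fstype($previous);
```

Both calls print filename.pdf. The setting is global to File::Basename, which is why the sketch restores the previous value afterwards.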
The regex
m/^.*\\(.*[.].*)$/
# ^^
assumes a \ separator, so cases 1 and 3 will never match. In case 2,
"c:\\Windows\eric\filename.pdf";
\e and \f are both special characters in Perl (escape and form feed). So the code "correctly" returns Windows\eric\filename.pdf as the filename, with those two control characters embedded in it. Remember to use \\ for every backslash!
Upvotes: 4