jww

Reputation: 102346

Perl function to normalize path with spaces?

I'm in maintenance mode and I'm working with a Perl script that is run on Apple, Linux, Windows and Unix. Some Apple and Linux systems, and most Windows systems, have spaces in the path. On Windows, the long file name needs quotes. On Apple and Linux, the space needs a backslash. If there's no space, then nothing needs to be done.

Perl's File::Copy and File::Spec are aware of system differences, and they abstract them for different file systems. Looking through the other File functions, I don't see which one normalizes or canonicalizes a pathname by adding quotes, adding slashes, moving quotes around, etc. as required.

Perl version requirements are v5.10. So I should be able to expect at least v5.10 without any trouble.

What is the Perl function to normalize or canonicalize path with spaces?


Here's an oversimplified example on Windows:

my $testcat = catfile(catdir("\"C:\\Program Files\"", "My Program"), "test.txt");
print "Test cat: $testcat\n";

The result is the following. Notice the quoting is not right and the path separator is wrong.

Test cat: "C:/Program Files"/My Program/test.txt

Here is what I expect on a Windows system (or an error):

Test cat: "C:\Program Files\My Program\test.txt"

There are similar questions, but they all seem to be one-off. For example, "How to handle filenames with spaces?" says to manually add quotes for Windows. I'm looking for the Perl routines to do it.

Upvotes: 3

Views: 1955

Answers (2)

Borodin

Reputation: 126742

I'm not sure why you think you need a library function to wrap something in double quotes?

You're mixing in the quotes/escapes far too early. They're only needed in certain circumstances, when the path is part of a longer string that will be treated as a space-separated list of substrings. The most obvious example is a command line for cmd/bash

While you're working with the string in your program you need just the plain path string without any decoration. Once you've built your path, create your command line (or whatever) with quotes around it, and it should all work

I've never been able to get the escape character for Windows cmd (which is circumflex ^) to work reliably, so I always wrap any strings that contain space characters in double quotes. That works on Windows and any flavour of Unix, including OSX
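The "wrap it in double quotes only at the last moment" rule can be sketched as a tiny helper. This is a minimal sketch, not a library routine: quote_if_spaced is a hypothetical name, and it assumes the path itself contains no double quote characters.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a finished path in double quotes only when it
# contains whitespace. It does not handle a path that already contains a
# double quote character.
sub quote_if_spaced {
    my ($path) = @_;
    return $path =~ /\s/ ? qq{"$path"} : $path;
}

# Build and use the plain path first; decorate it only for the command line
print quote_if_spaced('C:\Program Files\My Program\test.txt'), "\n";
print quote_if_spaced('/usr/local/bin/perl'), "\n";
```

The first call returns the path wrapped in double quotes; the second returns its argument unchanged because there is no whitespace in it.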

Here's an example using the code in your question. Note that there's no need to be so careful about using catdir and catfile appropriately: unless you're building a root directory like C:\ they behave identically on systems where there is no syntactical distinction between files and directories, which includes all the platforms you mention in your question

use strict;
use warnings 'all';

use File::Spec::Functions qw/ catfile /;

my $testcat = catfile('C:\Program Files', 'My Program', 'test.txt');

print qq{Test cat: "$testcat"\n};

system qq{type "$testcat"};

output

Test cat: "C:\Program Files\My Program\test.txt"
TESTCAT CONTENTS



Update

Here's another example showing how path segments that have reached your program can be unquoted before they're used. I've defined three scalar variables. Some or all of those may have originated outside your program, while others may be defined like this, as string literals. The point is that $root is enclosed in unwanted double quotes; it is an invalid path segment and won't work if you pass it to catfile

So I've written a little subroutine unquote and applied it to all three as we're pretending we don't know which of the segments are quoted and which are not. As you can see from the output, it removes the quotes from $root but leaves the other two strings untouched. Now they're all valid and okay to pass to catfile

The output shows that catfile returns Test cat: C:\Program Files\My Program\test.txt which is what we want. Now suppose we want to type it, so we need to create the command line

type "C:\Program Files\My Program\test.txt"

In the context of the command line, the double quotes are necessary to delimit the path string, but they are not part of the path

Again, as you can see, the call to system works fine. My file contains TESTCAT CONTENTS, and that is what my program prints

I hope that helps?

use strict;
use warnings 'all';
use feature 'say';

use File::Spec::Functions qw/ catfile /;

my ($root, $dir, $file) = ( '"C:\Program Files"', 'My Program', 'test.txt');

print <<END;
Original:
Root: $root
Dir:  $dir
File: $file

END


unquote($_) for $root, $dir, $file;


print <<END;
Unquoted:
Root: $root
Dir:  $dir
File: $file

END


my $testcat = catfile($root, $dir, $file);

say "Full path: $testcat";

my $cmd = qq{type "$testcat"};
say "Command is:\n$cmd\n";

system $cmd;


sub unquote {
    $_[0] =~ s/\A"([^"]*)"\z/$1/;
    $_[0];
}

output

Original:
Root: "C:\Program Files"
Dir:  My Program
File: test.txt

Unquoted:
Root: C:\Program Files
Dir:  My Program
File: test.txt

Full path: C:\Program Files\My Program\test.txt
Command is:
type "C:\Program Files\My Program\test.txt"

TESTCAT CONTENTS

Upvotes: 2

Jon Ericson

Reputation: 21525

I'm not sure how you managed to get the output you describe. On Windows I get:

Test cat: "C:\Program Files"\My Program\test.txt

On OSX, I get:

Test cat: "C:\Program Files"/My Program/test.txt

Which OS and version of Perl are you using? Is it possible you left out some relevant parts of your script?


Your example shows confusion about quoting and escaping strings in Perl. It might help to break it down into smaller pieces to see what's going on and put the pieces together later:

print "\"C:\\Program Files\""

"C:\Program Files"

This is probably what you expected. It uses a double-quoted string with backslash escapes to build the string you want to use. Note: you can simplify this statement by using a non-interpolated (single-quoted) string:

print '"C:\Program Files"'

Appending the directory, you start using File::Spec:

use File::Spec::Functions;
print catdir('"C:\Program Files"', "My Program")

"C:\Program Files"\My Program

This is where things get funky. catdir expects a list of directories, but you provided a string that is almost certainly not a directory as the first item in the list.

Given you prepended the directory with the C:\ volume, there's a good chance you actually want to use the catpath function:

  • catpath()

    Takes volume, directory and file portions and returns an entire path. Under Unix, $volume is ignored, and directory and file are concatenated. A '/' is inserted if need be. On other OSes, $volume is significant.

      $full_path = File::Spec->catpath( $volume, $directory, $file );
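A minimal sketch of catpath, assuming the volume and directory arrive as separate, unquoted strings. It calls File::Spec::Win32 and File::Spec::Unix explicitly (both ship with Perl) so the platform difference is visible from any machine; the path pieces are only illustrative.

```perl
use strict;
use warnings;

use File::Spec::Win32;
use File::Spec::Unix;

# Illustrative pieces; note that none of them carries quotes
my ($volume, $file) = ('C:', 'test.txt');

# Windows rules: the volume is significant and is glued onto the directory
my $win = File::Spec::Win32->catpath($volume, '\Program Files\My Program', $file);

# Unix rules: the volume is ignored
my $unix = File::Spec::Unix->catpath($volume, '/Program Files/My Program', $file);

print "Win32: $win\n";
print "Unix:  $unix\n";
```

In normal use you would call File::Spec->catpath and let Perl pick the rules for the platform it is running on.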
    

The resulting string would not be directly usable on the command line if it contains spaces, because Perl makes some rather Unixish assumptions. But as answers to the related question point out, you can insert double quotes after constructing the path. As it turns out, double quotes also protect spaces on OSX and Linux; you don't need to escape each individual space.
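When the command is an external program rather than a shell builtin, another option is to skip the quoting altogether: the list form of Perl's system passes each argument to the program separately instead of handing one string to the shell. The following is a sketch under stated assumptions: the scratch directory and file are hypothetical, and the child process is simply the running perl ($^X) checking that the file exists. It would not work for cmd builtins such as type, which need the shell.

```perl
use strict;
use warnings;

use File::Spec::Functions qw/ catfile /;
use File::Temp qw/ tempdir /;

# Hypothetical setup: a scratch directory whose name contains a space
my $dir  = tempdir('my program XXXXXX', TMPDIR => 1, CLEANUP => 1);
my $path = catfile($dir, 'test.txt');

open my $fh, '>', $path or die "Cannot create '$path': $!";
print {$fh} "TESTCAT CONTENTS\n";
close $fh or die "Cannot close '$path': $!";

# List form: the path is passed as a single argument, with no quotes added
my $status = system($^X, '-e', 'exit(-f $ARGV[0] ? 0 : 1)', $path);

print "Child exit status: ", $status >> 8, "\n";
```

The trade-off is that shell features (redirection, pipes, builtins) are unavailable in list form, so the double-quote approach is still needed for those.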

Alternatively, use a module designed for accomplishing whatever you are trying to do. File::Copy does a good job of addressing cross-platform concerns, for instance.

Upvotes: 3
