Andrei Doanca

Reputation: 251

perl split interesting behavior

Can somebody explain this weird behavior?

I have a path in a string and I want to split it on each backslash:

my $path = "D:\Folder\AnotherFolder\file.txt";

my @folders = split('\', $path);

The code above doesn't work, not even if I escape the backslash like this:

my @folders = split('\\', $path);

But with a regex it does work:

my @folders = split( /\\/, $path);

Why is that?

Upvotes: 7

Views: 4517

Answers (4)

psxls

Reputation: 6925

When split is called as split STRING rather than split REGEX, the string is converted into a regex. In your case split '\\' becomes split /\/, because the first backslash in the string literal is treated as an escape character, so only a single backslash reaches the regex engine.

The correct way to do it is split '\\\\', which becomes split /\\/.
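To make the two levels of escaping concrete, here is a minimal sketch comparing the string form and the regex form. The variable names @from_string and @from_regex are made up for illustration.

```perl
use strict;
use warnings;

# Single quotes keep the backslashes in the path as they are
my $path = 'D:\Folder\AnotherFolder\file.txt';

# String form: '\\\\' is a two-backslash string, which becomes the regex \\
my @from_string = split '\\\\', $path;

# Regex form: /\\/ matches one literal backslash directly
my @from_regex = split /\\/, $path;

print join('|', @from_string), "\n";   # D:|Folder|AnotherFolder|file.txt
print join('|', @from_regex), "\n";    # D:|Folder|AnotherFolder|file.txt
```

Both calls hand the same pattern to the regex engine; the string form just needs one extra round of escaping to get there.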

Upvotes: 2

Borodin

Reputation: 126722

One of the neater ways to extract the elements of a path is to extract all sequences of characters other than a path separator.

use strict;
use warnings;

my $path = 'D:\Folder\AnotherFolder\file.txt';
my @path = $path =~ m([^/\\]+)g;

print "$_\n" for @path;

Output:

D:
Folder
AnotherFolder
file.txt

Upvotes: 2

TLP

Reputation: 67900

I think amon gave the best literal answer to your question in his comment:

more explicitly: strings and regexes have different rules for escaping. If a string is used in place of a regex, the string literals suffer from double escaping

Meaning that split '\\' uses a string and split /\\/ uses a regex.

As a practical answer, I wanted to add this:

Perhaps you should consider using a module designed for splitting paths. File::Spec is a core module in Perl 5. Also, you have to escape backslashes in a double-quoted string, which you have not done. Alternatively, use single quotes, which looks a bit better in my opinion.

use strict;
use warnings;
use Data::Dumper;
use File::Spec;

my $path = 'D:\Folder\AnotherFolder\file.txt';  # note the single quotes
my @elements = File::Spec->splitdir($path);
print Dumper \@elements;

Output:

$VAR1 = [
          'D:',
          'Folder',
          'AnotherFolder',
          'file.txt'
        ];
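One caveat worth sketching: File::Spec applies the path rules of the operating system it runs on, so the output above assumes the script runs on Windows. On other systems, a possible variation is to call the Windows-specific class File::Spec::Win32 (also a core module) directly:

```perl
use strict;
use warnings;
use File::Spec::Win32;

my $path = 'D:\Folder\AnotherFolder\file.txt';

# Force Windows path rules regardless of the OS running the script
my @elements = File::Spec::Win32->splitdir($path);

print "$_\n" for @elements;
```

This should give the same four elements as the Dumper output above, whichever platform runs it.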

Upvotes: 6

DJG

Reputation: 6543

If you look at the documentation by running:

perldoc -f split

you will see three forms of arguments that split can take:

split /PATTERN/,EXPR,LIMIT
split /PATTERN/,EXPR
split /PATTERN/

Every form takes a /PATTERN/ as its first argument. This means that even when you pass split a string as the first argument, Perl coerces it into a regex.

If we look at the error we get when trying to do something like this in re.pl:

$ my $string_with_backslashes = "Hello\\there\\friend";
Hello\there\friend
$ my @arry = split('\\', $string_with_backslashes);
Compile error: Trailing \ in regex m/\/ at (eval 287) line 6.

we see that first, '\\' is parsed as a string literal: an escaping backslash followed by an actual backslash, which evaluates to a single backslash.

split then takes that single backslash and coerces it to a regex, as if we had written:

$ my @arry = split(/\/, $string_with_backslashes);

which doesn't work because the single backslash is interpreted as escaping the forward slash after it, so that slash no longer terminates the regex. In the string form there is nothing after the backslash at all, hence the "Trailing \" error.
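A minimal sketch of the failure and one way around it, assuming the separator lives in a variable (the names $sep, $error, and @parts are illustrative). Wrapping the variable in \Q...\E escapes it for the regex engine, so only the string-level escaping has to be written by hand:

```perl
use strict;
use warnings;

my $string = 'Hello\there\friend';
my $sep    = '\\';                 # string literal for ONE backslash

# The lone backslash is not a valid regex, so split dies at runtime
my @broken = eval { split $sep, $string };
my $error  = $@;
print "split failed: $error" if $error;

# \Q...\E quotes the separator before it is compiled as a pattern
my @parts = split /\Q$sep\E/, $string;
print join('|', @parts), "\n";     # Hello|there|friend
```

quotemeta($sep) would do the same job as \Q...\E if the pattern is built as a plain string.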

Upvotes: 2
