user8065512

Splitting a String with Perl

I was following along with this tutorial on how to split strings when I came across a quote that confused me.

Words about Context

Put to its normal use, split is used in list context. It may also be used in scalar context, though its use in scalar context is deprecated. In scalar context, split returns the number of fields found, and splits into the @_ array. It's easy to see why that might not be desirable, and thus, why using split in scalar context is frowned upon.

I have the following script that I've been working with:

#!/usr/bin/perl
use strict;
use warnings;
use v5.24;

doWork();

sub doWork {
    my $str = "This,is,data";
    my @splitData = split(/,/, $str);
    say $splitData[1];
    return; 
} 

I don't fully understand how you would use split on a list.

From my understanding, using the split function on my $str variable is frowned upon? How then would I go about splitting a string with the comma as the delimiter?

Upvotes: 1

Views: 7698

Answers (2)

ikegami

Reputation: 385819

The frowned-upon behaviour documented by that passage was deprecated at least as far back as 5.8.8 (11 years ago) and was removed from Perl in 5.12 (7 years ago).

The passage documents that

my $n = split(...);

is equivalent to

my $n = do { @_ = split(...); @_ };            # <5.12

The assignment to @_ is unexpected. This type of behaviour is called "surprising action at a distance", and it can result in malfunctioning code. As such, before 5.12, using split in scalar context was frowned upon. Since 5.12, however,

my $n = split(...);

is equivalent to

my $n = do { my @anon = split(...); @anon };   # ≥5.12

With the surprising behaviour removed, using split in scalar context is no longer frowned upon for the reason stated in the passage you quoted.

It should probably still be avoided, not just for backwards compatibility, but because there are far better ways of counting the number of substrings. I would use the following:

my $n = 1 + tr/,//;    # Faster than: my $n = split(/,/, $_, -1);
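To make the comparison concrete, here is a minimal sketch assuming Perl 5.12 or later and a made-up sample line (the variable names are mine, not from the question). It shows that the tr count matches split with a limit of -1, while split without a limit drops trailing empty fields before counting.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample line with empty fields, including trailing ones.
my $line = "a,b,,c,,";

# Count the separators with tr///; the number of fields is one more.
my $by_tr = 1 + ($line =~ tr/,//);

# split in scalar context with a limit of -1 keeps trailing empty fields.
my $by_split_all = split(/,/, $line, -1);

# Without a limit, split discards trailing empty fields before counting.
my $by_split_default = split(/,/, $line);

say "tr count:            $by_tr";             # 6
say "split with -1 limit: $by_split_all";      # 6
say "split, no limit:     $by_split_default";  # 4
```

If the trailing empty fields matter to you, the missing limit is an easy way to get a count that is too small.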

You are using split in list context, so it does not exercise the frowned-upon behaviour, no matter what version of Perl you use. In other words, your usage is fine.

It's fine unless you are trying to handle CSV data, that is. In that case, you should be using Text::CSV_XS.

use Text::CSV_XS qw( );

my $csv = Text::CSV_XS->new({ auto_diag => 2, binary => 1 });

while (my $row = $csv->getline($fh)) { ... }   # Parsing CSV

for (...) { $csv->say($fh, $row); }            # Generating CSV
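To see why a plain split is not enough for CSV, here is a small sketch using a made-up record with a quoted field that contains a comma. The record and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical CSV record: three logical fields, the second one quoted
# because it contains a comma.
my $record = '1,"Smith, John",42';

# A naive split knows nothing about quoting and cuts at every comma.
my @fields = split(/,/, $record);

say scalar(@fields);          # 4 pieces instead of 3 fields
say "[$_]" for @fields;       # the quoted field is torn in two
```

A real CSV parser also has to handle escaped quotes and embedded newlines, which is why a dedicated module is the safer choice.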

Upvotes: 2

user7818749

Calling split in scalar context isn't very useful. It effectively returns the number of separators plus one, and there are better ways of doing that.

For example,

my $str = "This,is,data";
my $splitData = split(/,/, $str);
say $splitData;

will print 3 as it counts the substrings after the split.

split in scalar context also used to store the split parts in @_, but that frowned-upon behaviour was removed because it's rather unexpected.

Using split in list context and assigning the result to an array is perfectly fine.

my $str = "This,is,data";

The line above is a single string.

my @splitData = split(/,/, $str);

You are now splitting the $str into an array, or a list of values. So effectively you are now sitting with @splitData which is in fact:

"This" "is" "data"

So you can either use them all at once, as in say @splitData, or access a single element as a scalar. Write a single element as $splitData[1]; @splitData[1] is a one-element array slice, and Perl warns about it under use warnings.
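As a rough illustration of the difference between the two sigils, here is a sketch (variable names other than @splitData are mine): the $ form fetches one element, while the @ form with a list of indexes is an array slice that returns several elements.

```perl
use strict;
use warnings;
use feature 'say';

my @splitData = split(/,/, "This,is,data");

# A single element is a scalar, so it takes the $ sigil.
my $second = $splitData[1];

# An array slice takes the @ sigil and returns a list of elements.
my @first_two = @splitData[0, 1];

# Negative indexes count from the end of the array.
my ($first, $last) = @splitData[0, -1];

say $second;            # is
say "@first_two";       # This is
say "$first .. $last";  # This .. data
```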

The tutorial says it nicely. Use split on a string to create a list of substrings.

You can then iterate over the list values in a loop instead of printing each element individually.

my $str = "This,is,data";
my @splitData = split(/,/, $str);
foreach my $value (@splitData) {
    say $value;    # say already appends a newline
}

On each iteration, $value refers to the next element in turn: $splitData[0], then $splitData[1], and so on.
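One consequence worth seeing in code: the loop variable of foreach is an alias for the current element, not a copy, so assigning to it changes the array. A minimal sketch, reusing the string from the question:

```perl
use strict;
use warnings;
use feature 'say';

my @splitData = split(/,/, "This,is,data");

for my $value (@splitData) {
    # $value is an alias for the current element, so this assignment
    # modifies @splitData in place.
    $value = uc $value;
}

say "@splitData";    # THIS IS DATA
```

If you only want to read the values, this makes no difference; if you want to change a copy, assign the element to a separate variable inside the loop first.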

Upvotes: 1
