Reputation: 15
I have a CSV file in which every value is a string, and I need to enclose each value in quotes. I'm getting unexpected extra quotes when concatenating:
$outline = "";
$line = "John,Smith,[email protected],000-0000";
@parts = split (',',$line);
for $part (@parts) {
$part = '"' . $part . '"';
if ($outline eq "") {
$outline = $part; # reconstruct line
} else {
$outline = $outline . "," . $part;
}
}
$outline = $outline . "," . '"' . $parts[0] . " " . $parts[1] . '"';
print "$outline\n";
I expected:
"John","Smith","jsmith.net","000-0000","John Smith"
but I got:
"John","Smith","jsmith.net","000-0000",""John" "Smith""
Why am I getting the extra quotes?
Thanks for the help.
Upvotes: 0
Views: 165
Reputation: 67890
A lot of practical solutions have been provided, but I wanted to address your actual question: why does this happen?
The reason you are getting the doubled double quotes is that you are actually changing the elements of @parts. Inside a for loop, the loop variable is aliased to each element of the list in turn, so any change you make to it is made to the "real" value as well. Consider the following:
my @foos = 1 .. 3;
for my $foo (@foos) {
$foo += 1;
}
print "@foos"; # prints 2 3 4
So when you change $part in your code, the array @parts is also changed, and becomes like this (Data::Dumper output):
$VAR1 = [
'"John"',
'"Smith"',
'"[email protected]"',
'"000-0000"'
];
And from that point on, you cannot put together the strings "John" and "Smith" without first removing the quotes again.
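One way to sidestep the aliasing, sketched here on the assumption that you want @parts to stay unquoted: build the quoted strings as new values instead of assigning to the loop variable. The sub name quote_fields and the address jsmith@example.net are placeholders of mine, not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns quoted copies and leaves its arguments unmodified.
sub quote_fields {
    return map qq("$_"), @_;
}

# Single quotes, so that @example is not interpolated as an array.
my $line  = 'John,Smith,jsmith@example.net,000-0000';
my @parts = split /,/, $line;

# "@parts[0,1]" interpolates the slice joined by a space: John Smith
my $outline = join ',', quote_fields(@parts, "@parts[0,1]");

print "$outline\n";
# prints "John","Smith","jsmith@example.net","000-0000","John Smith"
```

Because nothing is ever assigned to an alias, $parts[0] and $parts[1] are still the bare names when the combined field is built.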
I also prepared a solution using Text::CSV, but I see ThisSuitIsBlackNot has already posted one, so you can take a look at his answer for a practical solution.
For a more lightweight solution you can use Text::ParseWords. This, like Text::CSV, has the benefit of handling quoted delimiters.
use Text::ParseWords;
my $line = 'John,Smith,[email protected],000-0000';
my @parts = quotewords(",", 0, $line);
push @parts, "@parts[0,1]";
print join ",", map qq("$_"), @parts;
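To illustrate what "handling quoted delimiters" means, here is a small sketch comparing split with quotewords. The input line is made up for the example; it is not from the question.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical input: the second field contains a comma inside quotes.
my $line = 'John,"Smith, Jr.",000-0000';

my @naive  = split /,/, $line;            # cuts the quoted field in two
my @parsed = quotewords(',', 0, $line);   # keep = 0 strips the enclosing quotes

print scalar(@naive),  " fields from split\n";        # 4
print scalar(@parsed), " fields from quotewords\n";   # 3
print "second field: $parsed[1]\n";                   # Smith, Jr.
```

Note that re-quoting with qq("$_") does not escape quote characters embedded in a field; if your data can contain those, Text::CSV is probably the safer choice.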
Upvotes: 6
Reputation: 24073
I always use Text::CSV when working with delimited data. It allows you to easily change delimiters, quoting behavior, and escape characters, and it handles fields that contain the delimiter, which is difficult to get right on your own (although this isn't applicable to your example).
The following will quote all of the fields in the file input.csv and write the results to STDOUT:
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new({
binary => 1,
auto_diag => 1,
always_quote => 1,
eol => $/
}) or die "Cannot use CSV: " . Text::CSV->error_diag;
open my $fh, '<', 'input.csv' or die "input.csv: $!";
while (my $row = $csv->getline($fh)) {
$csv->print(\*STDOUT, $row);
}
close $fh;
input.csv
John,Smith,[email protected],000-0000
Jane,Doe,[email protected],000-0000
Output
"John","Smith","[email protected]","000-0000"
"Jane","Doe","[email protected]","000-0000"
Upvotes: 2
Reputation: 107060
There's no reason to use a for loop to string together the various parts. If you can use split, you can use join:
my $line = 'John,Smith,[email protected],000-0000'; # Single quotes: no @ interpolation
my @parts = split /,/, $line; # Split the line on commas
my $new_line = join q(","), @parts; # Separate the parts with quote-comma-quote
$new_line = qq("$new_line"); # Add the opening and closing quotes
The q(...) construct is a quote-like operator that behaves like single quotes, and qq(...) is one that behaves like double quotes. qq("$new_line") and q(",") are a bit easier to read than "\"$new_line\"" or '","'.
I'm using join to put "," between all the parts. That handles the separators in the middle of $new_line, but not the opening and closing quotes, so a second statement is needed to add them.
Upvotes: 0
Reputation: 29854
$part in the foreach loop aliases each element of @parts, so you're actually storing the quoted strings back into the array.
Try using Data::Dumper to dump @parts at the bottom of each loop iteration.
use Data::Dumper;
...
print Dumper( \@parts );
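A fuller sketch of that debugging step, with a shortened, made-up input line; the two Data::Dumper settings are only there to keep each dump on one line:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 0;   # print each dump on a single line
$Data::Dumper::Terse  = 1;   # omit the "$VAR1 =" prefix

# Hypothetical input, shortened for the example.
my @parts = split /,/, 'John,Smith,000-0000';

for my $part (@parts) {
    $part = '"' . $part . '"';     # writes through the alias into @parts
    print Dumper(\@parts), "\n";   # dump at the bottom of each iteration
}
```

Each successive dump should show one more element of @parts wrapped in quotes, which makes the aliasing visible.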
Upvotes: 0