masterial

Reputation: 2216

How do I split a string into an array by comma but ignore commas inside double quotes?

I have a line:

$string = 'Paul,12,"soccer,baseball,hockey",white';

I am trying to split this into an @array with 4 values, so that

print $array[2];

Gives

soccer,baseball,hockey

How do I do this?

Upvotes: 0

Views: 12615

Answers (7)

vaishali

Reputation: 335

Try this:

  @array = ($string =~ /^([^,]*),([^,]*),"([^"]*)",([^,]*)$/);

The array will contain the output you expect. Note that this pattern only works for exactly this layout: four fields, with the third one quoted.

Upvotes: 0

Use this regex: m/("[^"]+"|[^,]+)(?:,\s*)?/g;

The above regular expression, applied globally, captures either a double-quoted string ("[^"]+") or a run of non-comma characters ([^,]+), then skips an optional trailing comma and any whitespace after it. Quoted fields keep their surrounding quotes.

Here is sample code and the corresponding output.
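As a minimal sketch (not part of the original answer), the same pattern can be applied to the string from the question, with the surrounding quotes stripped in a second step. The variable name @fields is illustrative.

```perl
use strict;
use warnings;

my $string = 'Paul,12,"soccer,baseball,hockey",white';

# Capture either a quoted run or a run of non-comma characters.
my @fields = $string =~ m/("[^"]+"|[^,]+)(?:,\s*)?/g;

# Remove the surrounding double quotes from any quoted field.
s/^"(.*)"$/$1/ for @fields;

print "$fields[2]\n";   # prints: soccer,baseball,hockey
```

This only covers simple input: an empty field or an escaped quote inside a field is not handled by this pattern.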

my $string = "Word1, Word2, \"Commas, inbetween\", Word3, \"Word4Quoted\", \"Again, commas, inbetween\"";
my @arglist = $string =~ m/("[^"]+"|[^,]+)(?:,\s*)?/g;
print "$_\n" for @arglist;

Here is the output:

Word1
Word2
"Commas, inbetween"
Word3
"Word4Quoted"
"Again, commas, inbetween"

Upvotes: 3

Joel Berger

Reputation: 20280

In response to how to do it with Text::CSV(_PP), here is a quick example.

#!/usr/bin/perl

use strict;
use warnings;

use Text::CSV_PP;
my $parser = Text::CSV_PP->new();

my $string = "Paul,12,\"soccer,baseball,hockey\",white";

$parser->parse($string);
my @fields = $parser->fields();

print "$_\n" for @fields;

Normally one would install Text::CSV or Text::CSV_PP through the cpan utility.

To work around not being able to install modules, I suggest you use the 'pure Perl' implementation so that you can 'install' it by hand. The above example will work if you copy the Text::CSV_PP source into a file named CSV_PP.pm inside a folder called Text, created in the same directory as your script (on Perl 5.26 and later the current directory is no longer in @INC, so you will also need use lib). You could also put it in some other location and use the use lib 'directory' method as discussed previously. See here and here for other ways to get around install restrictions when using CPAN modules.
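A rough sketch of the use lib approach follows. The lib/ directory name and its placement next to the script are assumptions made for illustration; adjust the path to wherever you copied Text/CSV_PP.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use FindBin qw($Bin);   # core module: $Bin is the directory containing this script
use lib "$Bin/lib";     # assumption: the module was copied to lib/Text/CSV_PP.pm

# Load at run time so that a missing copy gives a readable message
# instead of a compile-time failure.
my $have_csv = eval { require Text::CSV_PP; 1 };
print $have_csv
    ? "Text::CSV_PP loaded\n"
    : "Text::CSV_PP not found; expected a copy under $Bin/lib\n";
```

Once the module loads, the parsing code from the example above can be used unchanged.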

Upvotes: 5

oylenshpeegul

Reputation: 3424

The standard module Text::ParseWords will do this as well.

use Text::ParseWords;
my @array = parse_line(q{,}, 0, $string);
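For reference, here is a self-contained sketch using the string from the question. The second argument to parse_line is the keep flag: with 0 the quotes are removed from the returned fields, with 1 they are retained. The @raw array is only there to show the difference.

```perl
use strict;
use warnings;
use Text::ParseWords;   # core module; exports parse_line by default

my $string = 'Paul,12,"soccer,baseball,hockey",white';

# keep = 0: quotes are stripped from the returned fields
my @array = parse_line(q{,}, 0, $string);
print "$array[2]\n";    # prints: soccer,baseball,hockey

# keep = 1: quotes are left in place
my @raw = parse_line(q{,}, 1, $string);
print "$raw[2]\n";      # prints: "soccer,baseball,hockey"
```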

Upvotes: 7

Purandaran

Reputation: 74

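$string = "Paul,12,\"soccer,baseball,hockey\",white";

# Replace each comma inside the quoted field with the placeholder 'aaa'
# (the placeholder must not occur anywhere in the data)
1 while ($string =~ s#"(.*?),(.*?)"#"$1aaa$2"#g);

# Split on the remaining commas, then restore the commas and drop the quotes
@array = map { s/aaa/,/g; s/"//g; $_ } split(/,/, $string);

print "$array[2]\n";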

Upvotes: -1

Nikhil Jain

Reputation: 8352

use strict;
use warnings;
#use Data::Dumper;

my $string = qq/Paul,12,"soccer,baseball,hockey",white/;

#split string into three parts
my ($st1, $st2, $st3) = split(/,"|",/, $string);
#output: st1:Paul,12 st2:soccer,baseball,hockey  st3:white  

#split $st1 into two parts
my ($st4, $st5) = split(/,/,$st1);

#push records into array
push (my @test,$st4, $st5,$st2, $st3 ) ;

#print Dumper \@test;
print "$test[2]\n";

output:

soccer,baseball,hockey 

#$VAR1 = [
#          'Paul',
#         '12',
#          'soccer,baseball,hockey',
#          'white'
#        ];

Upvotes: -1

singingfish

Reputation: 3167

Just use Text::CSV. As you can see from the source, getting CSV parsing right is quite complicated:

sub _make_regexp_split_column {
    my ($esc, $quot, $sep) = @_;

    if ( $quot eq '' ) {
        return qr/([^\Q$sep\E]*)\Q$sep\E/s;
    }

   qr/(
        \Q$quot\E
            [^\Q$quot$esc\E]*(?:\Q$esc\E[\Q$quot$esc\E0][^\Q$quot$esc\E]*)*
        \Q$quot\E
        | # or
        [^\Q$sep\E]*
       )
       \Q$sep\E
    /xs;
}
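As a small illustration of that point (a sketch, not taken from the Text::CSV source), the simple pattern m/("[^"]+"|[^,]+)(?:,\s*)?/g suggested in another answer silently drops an empty field, because both alternatives require at least one character. The sample line is made up for the demonstration.

```perl
use strict;
use warnings;

# The second field is empty, so a CSV parser should report 4 fields.
my $line = 'Paul,,"soccer,baseball",white';

# Simple pattern: a quoted run or a run of non-comma characters.
my @fields = $line =~ m/("[^"]+"|[^,]+)(?:,\s*)?/g;

# The empty field is skipped, so only 3 fields come back.
printf "%d fields: %s\n", scalar @fields, join(' | ', @fields);
```

Escaped quotes and embedded newlines are further cases a hand-written pattern has to deal with, which is why a dedicated parser is usually the safer choice.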

Upvotes: 11
