FrozenSoul90

Reputation: 313

Perl: parsing CSV with newlines in the data but without quotes

I have a simple Perl script that parses CSV:

my $csv = Text::CSV->new(
    {   auto_diag             => 1,
        allow_loose_quotes    => 1,
        eol                   => "\r\n",
        sep_char              => '|',
        allow_unquoted_escape => 1,
        escape_char           => '\\',
        binary                => 1
    }
    )
    or die "" . Text::CSV->error_diag();

Now I have a weird CSV:

01|10|Alpha|Test
01|20|Alpha
2|Test

Though it looks like three lines, it is really two records: the third field of the second record is "Alpha\n2". Unfortunately, my source system doesn't send fields in quotes. Is there any way I can still parse this CSV successfully?

Upvotes: 1

Views: 348

Answers (1)

ikegami

Reputation: 386706

Text::CSV is just a proxy for Text::CSV_PP or Text::CSV_XS. I can replicate the bug in both: the record is cut off at the bare LF, and the rest of it ("2|Test") is lost.

use strict;
use warnings;
use feature qw( say );

use Data::Dumper qw( );
use Text::CSV_XS qw( );   # Or Text::CSV_PP

sub dumper {
    local $Data::Dumper::Indent = 0;
    local $Data::Dumper::Terse  = 1;
    local $Data::Dumper::Useqq  = 1;
    return Data::Dumper::Dumper($_[0]);
}

my $csv = Text::CSV_XS->new({   # Or Text::CSV_PP
    auto_diag             => 2,
    allow_loose_quotes    => 1,
    eol                   => "\r\n",
    sep_char              => '|',
    allow_unquoted_escape => 1,
    escape_char           => '\\',
    binary                => 1
});

my $file = "01|10|Alpha|Test\r\n01|20|Alpha\n2|Test\r\n01|30|Alpha|Test\r\n";
open(my $fh, '<:raw', \$file) or die $!;
my $rows = $csv->getline_all($fh);
say dumper($rows);

Output:

[["01",10,"Alpha","Test"],["01",20,"Alpha"],["01",30,"Alpha","Test"]]

Expected output:

[["01",10,"Alpha","Test"],["01",20,"Alpha\n2","Test"],["01",30,"Alpha","Test"]]

If they never use quotes or escapes, just read CRLF-terminated lines and split on pipe.

use strict;
use warnings;
use feature qw( say );

use Data::Dumper qw( );

sub dumper {
    local $Data::Dumper::Indent = 0;
    local $Data::Dumper::Terse  = 1;
    local $Data::Dumper::Useqq  = 1;
    return Data::Dumper::Dumper($_[0]);
}

my $file = "01|10|Alpha|Test\r\n01|20|Alpha\n2|Test\r\n01|30|Alpha|Test\r\n";
open(my $fh, '<:raw', \$file) or die $!;
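my @rows = do {
    local $/ = "\r\n";   # A record ends at CRLF; a bare LF stays inside the field.
    # chomp removes the trailing CRLF (the current $/); -1 keeps trailing empty fields.
    map { chomp; [ split(/\|/, $_, -1) ] } <$fh>
};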
say dumper(\@rows);
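
If the real file is too large to read in one go, the same idea can be written as a record-by-record loop. This is only a minimal sketch under the same assumption (no quotes, no escapes, and no literal CRLF inside a field). The sub name read_pipe_records and its optional field-count check are made up for illustration; they are not part of Text::CSV.

```perl
use strict;
use warnings;

# Hypothetical helper: reads CRLF-terminated, pipe-separated records from a
# filehandle one at a time. Assumes the data never contains quotes, escapes,
# or a literal CRLF inside a field.
sub read_pipe_records {
    my ($fh, $expected_fields) = @_;
    local $/ = "\r\n";    # A record ends at CRLF, not at a bare LF.
    my @rows;
    while (my $line = <$fh>) {
        chomp($line);     # Removes the trailing CRLF, if present.
        my @fields = split(/\|/, $line, -1);    # -1 keeps trailing empty fields.
        # Optional sanity check: warn about records with an unexpected width.
        warn "Record $.: expected $expected_fields fields, got " . @fields . "\n"
            if defined($expected_fields) && @fields != $expected_fields;
        push @rows, \@fields;
    }
    return \@rows;
}

my $data = "01|10|Alpha|Test\r\n01|20|Alpha\n2|Test\r\n";
open(my $fh, '<:raw', \$data) or die $!;
my $rows = read_pipe_records($fh, 4);
```

Here $rows->[1][2] holds "Alpha\n2". The field-count warning is a cheap way to notice records that were split in an unexpected place, for example if the source system ever sends a CRLF inside a field.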

Upvotes: 3
