Reputation: 33
I'm writing Perl scripts to automatically download CSVs from various billers' websites, but I'm having trouble turning the data from $mech->content() into something I can parse line by line. The content is a multi-line CSV file:
#!/usr/bin/perl
use WWW::Mechanize;
use IO::Socket::SSL qw();
my $mech = WWW::Mechanize->new();
...stuff...
my $data=$mech->content();
my (@lines)=split(/\n?\r/,$data);
print "lines=".@lines."\n---\n@lines\n---\n";
write_file("tmp.csv",$data);
for(my $i=0;$i<@lines;$i++){
# ...work that depends on each line being
# represented as an element of an array...
}
Originally I assigned $mech->content() directly to @lines, and I also tried a few other things such as $mech->content( raw => 1 ). As you can see above, I tried splitting on \n or \r. The browser shows the CSV file as text/plain, Quirks mode, UTF-8. Running "file tmp.csv" shows that it's ASCII text and is multi-line.
What am I doing wrong, and what's the right way to do this?
Upvotes: 1
Views: 239
Reputation: 164809
The problem is here:
my (@lines)=split(/\n?\r/,$data);
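You have the newline regex backwards. /\n?\r/ requires a carriage return, so data whose lines end in a bare line feed is never split, and CRLF data is split on the \r alone, leaving a stray \n at the start of each following line. It should be \r?\n, but it's safer to write \015?\012 for the literal characters because \r and \n can be different on some systems.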
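As a minimal sketch of the corrected split, the snippet below uses a hypothetical sample string in place of $mech->content(), with mixed CRLF and LF line endings to show that both are handled:

```perl
use strict;
use warnings;

# Hypothetical sample standing in for $mech->content():
# the first and last lines end in CRLF, the middle one in a bare LF.
my $data = "id,amount\015\012" . "1,9.99\012" . "2,12.50\015\012";

# \015?\012 matches an optional carriage return followed by a line feed.
# split drops the trailing empty field left by the final line ending.
my @lines = split /\015?\012/, $data;

print scalar(@lines), " lines\n";
print "[$_]\n" for @lines;
```

The brackets in the output make it easy to spot a stray \r or \n if the pattern is wrong.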
Your for loop can be better written as:
for my $line (@lines) {
However, you generally don't want to process entire files as an array. What you're doing can use a tremendous amount of memory. Instead it's better to first save it to disk and read the CSV file line by line.
use autodie;
$mech->get( $uri, ':content_file' => "test.csv" );
open my $fh, '<', "test.csv";
while( my $line = <$fh> ) {
...
}
But don't do your own CSV parsing. It's much faster and less buggy to use Text::CSV_XS.
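A rough sketch of what that might look like follows. The sample data and column names are made up for illustration, and it reads from an in-memory filehandle so it runs without a file on disk; in practice you would open the file saved by $mech->get instead. It assumes Text::CSV_XS is installed from CPAN.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical CSV content with a quoted comma and an escaped quote,
# the kind of input that breaks a naive split on ",".
my $data = qq{id,payee,amount\r\n1,"Smith, John",9.99\r\n2,"Acme ""Power"" Co",12.50\r\n};

# binary => 1 allows non-ASCII bytes in fields;
# auto_diag => 1 reports parse errors automatically.
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

# Open the string as an in-memory filehandle (assumption: a real
# script would open "test.csv" here instead).
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my @rows;
while ( my $row = $csv->getline($fh) ) {
    push @rows, $row;    # $row is an array reference of fields
}
close $fh;

print join( ' | ', @$_ ), "\n" for @rows;
```

Because getline reads one record at a time, this keeps the memory benefit of the line-by-line loop while handling quoting and embedded commas correctly.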
Upvotes: 1