Reputation: 5423
I'm trying to go paperless with all my utility bills, and that means downloading the statements from Suddenlink instead of stuffing the paper ones into a filing cabinet.
I've used WWW::Mechanize before and I've liked it (Why did I try to do this stuff in LWP for so long?), so I've gone ahead and gotten a workable script ready. I can log in, navigate to the page that lists the PDF links, and loop through those.
I do the following:
    my $pdf = $mech->clone();
    for my $link ($mech->find_all_links(url_regex => qr/viewstatement\.html/)) {
        # [removed for brevity]
        unless (-f "Suddenlink/$year/$date.pdf") {
            $pdf->get($link->url);
            $pdf->save_content("Suddenlink/$year/$date.pdf", binary => 1);
        }
    }
When I compare one of these files with the same statement downloaded via Chrome, it's apparent what the problem is. Both files are identical for roughly the first 8-24 KB (it varies), but the Chrome PDF is complete and the Perl-script PDF is truncated.
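One way to confirm the truncation from inside the script, rather than by diffing against Chrome, might be to compare the body length with the Content-Length header and to look for the X-Died header that LWP sets when reading the body fails partway through. This is only a sketch: truncation_reason is a hypothetical helper name, and it assumes the server actually sends Content-Length.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: returns a short reason string if the response body
# looks truncated, or undef if nothing looks wrong.
sub truncation_reason {
    my ($res) = @_;

    # LWP records a failure while reading the body in the X-Died header.
    my $died = $res->header('X-Died');
    return "read aborted: $died" if defined $died;

    # Without Content-Length (e.g. chunked transfer) there is nothing to compare.
    my $expected = $res->header('Content-Length');
    return undef unless defined $expected;

    # Compare against the raw (undecoded) body, which is what was sent on the wire.
    my $got = length $res->content;
    return "got $got of $expected bytes" if $got < $expected;

    return undef;
}
```

In the loop above this could be called as truncation_reason($pdf->response) right after the get(), skipping save_content() and warning when it returns something.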
It's late, and there's nothing obviously wrong with the code. Google is turning up a few problems with save_content(), but not anything like what I'm getting.
What am I doing wrong?
Upvotes: 2
Views: 436
Reputation: 27183
...[S]et $mech->agent_alias() to something. [Suddenlink is] doing a connection reset whenever they see a weird user agent string. – John O 18 hours ago
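For anyone landing here later, a minimal sketch of what that suggestion could look like. The alias 'Windows Mozilla' is just one of the names WWW::Mechanize ships with; nothing above says which alias was actually used, so treat the choice as an assumption.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();

# Replace the default "WWW-Mechanize/x.y" user agent string with a
# browser-like one. known_agent_aliases() lists the available names.
$mech->agent_alias('Windows Mozilla');

# Clone after setting the alias so the download agent sends the same string.
my $pdf = $mech->clone();
$pdf->agent_alias('Windows Mozilla');   # harmless if the clone already has it

print $mech->agent, "\n";
```

The rest of the script (login, find_all_links, save_content) would stay as it is; only the user agent string changes.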
Upvotes: 2