Ian

Reputation: 2021

Apache2 with PDF and PHP - "This file does not start with '%PDF-'"

I have been trying to find the reason for this error for weeks now, and I have come up blank. The system uses PHP to generate dynamic .pdf files.

I have three servers: Dev (Win7 with Apache2), Test (Ubuntu 10.04 with nginx), and Live (Ubuntu 10.10 with nginx). All are running PHP 5 and the same code for the system I have developed, with equivalent configuration.

I have tested with many browsers: DevIE (Win7, IE8), DevFF (Win7, Firefox 3.5), DevSaf (Win, Safari), LaptopFF (WinXP, Firefox 3.5), LaptopIE (WinXP, IE8), Test (Ubuntu, FF 3.5), and users (mostly IE8 on Win 7 and Win XP).

The users are reporting intermittent problems. A request fails, then they repeat it and it succeeds.

When it fails....

The server log shows the generated PDF being sent, with a reply of the right sort of size (500KB to 1.8MB) and a 200 OK result. This is sometimes followed about 10 seconds later by a repeat request for the same URL, but this one returns the log-on screen (again a 200 OK reply), only 2K in size. The implication is that it was requested without the cookie.

Adobe Reader tries to display the log-on page, with the inevitable "This file does not start with '%PDF-'" error message.

The exception is when I try with the laptop and IE8 - then it fails, with View Source showing a 4-line HTML file with an empty body!

The system has been working for over a year, and only started failing with a change of production server about 2 months ago. The test version was not changed at that time, but it started to fail as well.

I have tried all sorts of headers, but nothing I have tried makes any difference. The current set of headers is:

header('Content-Disposition: inline; filename="'.$this->pdfFilename().'"');
header('Content-type: application/pdf');
header("Pragma: public");
$when = date('r',time()+20);  // expire in 20 seconds
header("Expires: $when");

I've tried replacing inline with attachment, and adding and removing all sorts of no-cache headers, all to no avail.

The PDF is requested in a new window, by JavaScript - and is followed 8 seconds later by a refresh. I have tested without the new window, and without the refresh - no change.

I have had a few (small) PDFs served successfully by the Dev server, so I raised every limit I can think of. Now it always fails.

So I have a Windows Apache2.2 server that fails when browsed from the same machine and succeeds when browsed from other machines in Firefox.

There is no proxy or cache mechanism involved other than that in the browsers.

Has anyone any ideas about what might be going wrong? As I said, I have been testing and eliminating things for nearly 4 weeks now, on and off, and I have not yet even identified the failing component.

Upvotes: 3

Views: 1260

Answers (2)

Till

Reputation: 22416

This is really tough to troubleshoot - for starters, (please excuse my bluntness, but) this is a prime example of what a pipeline should not look like:

  • Three different operating systems.
  • Probably at least two different versions of PHP.
  • Two different webservers.

But anyway, a few general hints on debugging PHP:

  • make sure to enable error_log and log_errors in php.ini (set display_errors = Off)
  • use the most verbose error_reporting
  • set access_log and error_log in nginx.
  • crank up log level in nginx (I'm guessing you use php-cgi or php-fpm, so you should be able to see what status the backend emits when the download attempt fails).
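
A rough sketch of the PHP side of these hints, in case it helps. The function names and the log path are made up for illustration; nothing here comes from the asker's code. Keeping display_errors off matters for PDFs in particular: a notice printed into the response ahead of the PDF bytes would mean the file no longer starts with "%PDF-". The second function builds a log line that records which cookies actually arrived, since the question suspects the repeat request came without one.

```php
<?php
// Sketch only: function names are hypothetical, not part of the asker's system.

// Send every PHP error to a log file instead of the response body.
function enableVerboseLogging($logFile)
{
    error_reporting(E_ALL);          // most verbose reporting
    ini_set('display_errors', '0');  // never print errors into the PDF stream
    ini_set('log_errors', '1');
    ini_set('error_log', $logFile);
}

// Build one log line describing a request, including which cookies arrived.
function describeRequest(array $server, array $cookies)
{
    $uri   = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '-';
    $agent = isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '-';
    $names = $cookies ? implode(',', array_keys($cookies)) : 'none';
    return sprintf('uri=%s cookies=%s agent=%s', $uri, $names, $agent);
}
```

At the top of the PDF script you could then call enableVerboseLogging() with a writable path of your choice, followed by error_log(describeRequest($_SERVER, $_COOKIE)), and compare the failing and succeeding requests in the log.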

Furthermore:

  1. You haven't shared how the PDF is generated - are you sure all libraries used here are the same or at least somewhat the same across all systems?
  2. In any case, just to be sure, I would save the PDF on the server before it is offered for download. This allows you to troubleshoot the actual file and see whether the PDF generation actually worked.
  3. Since you're saving the PDF, I'd see about putting it in a public folder, so you can see if you can just redirect to it after it's generated. And only if this works, then I'd work on a force-download kind of thing.
  4. I would replicate the production environment in all stages. ;-) You need your dev server to be exactly like the production environment. For your own workstation, I'd recommend a VM (e.g. through VirtualBox with Ubuntu 10.10).
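
Point 2 could look something like the sketch below. It assumes the generator hands you the finished PDF as a string; the function names and the debug_ file prefix are invented for the example. The first function checks for the same "%PDF-" signature that Adobe Reader complains about, so a log-on page, a PHP notice, or even a stray leading newline would be caught on the server before anything is sent.

```php
<?php
// Sketch only: names are hypothetical; $data stands for whatever the generator returns.

// True when the data begins with the "%PDF-" signature Adobe Reader checks for.
function startsWithPdfSignature($data)
{
    return strncmp($data, '%PDF-', 5) === 0;
}

// Write the generated bytes to disk for inspection; returns the path, or false on failure.
function savePdfForInspection($data, $directory)
{
    $path = rtrim($directory, '/\\') . DIRECTORY_SEPARATOR . 'debug_' . md5($data) . '.pdf';
    $written = file_put_contents($path, $data);
    return ($written === strlen($data)) ? $path : false;
}
```

If the saved file opens fine in a reader but the browser still fails, the generation step is probably not the culprit and the delivery (headers, session, repeat request) is the place to look.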

Let me know if this gets you somewhere and reply with updates. :-)

Update:

I'd investigate these two headers:

header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past

Definitely helps with cache busting.
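
One related detail that may be worth checking: the Expires value is expected to be an HTTP-date in GMT, while date('r') as used in the question produces an RFC 2822 date with a numeric offset. A small sketch of building such dates with gmdate() follows; the helper names are hypothetical, and I am not claiming this is the cause of the failure.

```php
<?php
// Sketch only: helper names are hypothetical.

// Format a Unix timestamp as an HTTP-date, e.g. "Sat, 26 Jul 1997 05:00:00 GMT".
function httpDate($timestamp)
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// The two cache-busting headers from above, returned as plain strings.
function noCacheHeaders()
{
    return array(
        'Cache-Control: no-cache, must-revalidate',  // HTTP/1.1
        'Expires: ' . httpDate(869893200),           // a date in the past
    );
}
```

To apply them, loop over noCacheHeaders() and pass each string to header(). For the 20-second expiry in the question, httpDate(time() + 20) would give a correctly formatted value.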

Upvotes: 3

SteAp

Reputation: 11999

These are the headers, which finally worked in a similar situation in one of my apps:

header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private", false);
header("Content-Type: application/pdf");
header("Content-Disposition: inline; filename=\"YourPDF_" . time() . ".pdf\";");
header("Content-Transfer-Encoding: binary");
header("Content-Length: " . strlen($pdfData));

I added the time() code to make the filename change each time, so that it likely passes all proxies.
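
If you want to check the file-specific headers before sending them, they can be built as plain strings first. This is only a sketch with a made-up function name; it assumes $pdfData holds the complete PDF as a string and that strlen() counts bytes (which it does unless mbstring function overloading is switched on in an old PHP). The Content-Length has to match the bytes actually sent, so any extra output before or after the PDF would make it wrong.

```php
<?php
// Sketch only: the function name is hypothetical.

// Build the file-specific headers; $now is passed in so the result is predictable.
function pdfResponseHeaders($pdfData, $now)
{
    return array(
        'Content-Type: application/pdf',
        'Content-Disposition: inline; filename="YourPDF_' . $now . '.pdf"',
        'Content-Length: ' . strlen($pdfData),  // byte count of the body
    );
}
```

In the real script you would call it with time() as the second argument, pass each string to header(), and then echo $pdfData and nothing else.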

The problem still reappears occasionally. When it does, we ask our clients to download the file using the browser's context menu.

PS: The app uses ezPDF found here: http://www.ros.co.nz/pdf/

Upvotes: 0
