David Cain

Reputation: 17333

PDF print a password-protected page (ideally by making wkhtmltopdf use a cookie)

I'm trying to print a password-protected page to PDF with wkhtmltopdf, but I can't get it to load a working cookie, so I always end up printing the "log in" page.

Saving the cookie post-login

The following code works as expected: once logged in, I can view the protected pages, whether the session was loaded from the saved cookie or created by entering my login information:

import getpass
import os

import mechanize


class PrintPages(object):
    def __init__(self):
        ...
        self.browser = mechanize.Browser()
        self.cj = mechanize.MozillaCookieJar()
        self.browser.set_cookiejar(self.cj)
        self.login("cookies.txt")

    def login(self, cookie_jar):
        """ Log in and save the cookie if the jar file doesn't exist. Otherwise, load the cookie. """
        if os.path.isfile(cookie_jar):
            self.cj.load(cookie_jar, ignore_discard=True, ignore_expires=True)
        else:
            self.browser.open(self.login_url)

            self.browser.select_form(name="loginform")
            self.browser["username"] = self.username
            self.browser["password"] = getpass.getpass()

            self.browser.submit()
            self.cj.save(cookie_jar, ignore_discard=True, ignore_expires=True)

(cookies.txt)

    # Netscape HTTP Cookie File
    # http://www.netscape.com/newsref/std/cookie_spec.html
    # This is a generated file!  Do not edit.

    sub.example.com  FALSE   /   TRUE        JSESSIONID  B8307A77925DB287B0346C728BBF8F24

However, telling either wget or wkhtmltopdf to load the cookies gives me the login page.

$ wget -p --load-cookies cookies.txt sub.example.com/page.html
$ wkhtmltopdf --cookie-jar cookies.txt sub.example.com/page.html page.pdf

What gives? Any solution that lets me print to PDF would be fine, but I'm also curious what's going on here.


I'm using:

Upvotes: 4

Views: 2032

Answers (1)

Giles Smith

Reputation: 1972

I don't have a solution to your specific cookie problem, but here is what we have done to print PDFs of pages that require permissions:

  1. Expose a separate view with no login authentication.
  2. Create a single-use token for the PDF to be generated.
  3. In the view without authentication, check that the token is correct and has not been used yet. If the token is valid, return the HTML to convert into a PDF.
  4. If your view needs to know which user is requesting the PDF (to customise the page in some way), you can store the user id along with the token in the database.
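
The token handling in steps 2 to 4 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the class `PdfTokenStore` and its methods `issue` and `redeem` are hypothetical names, and a plain dictionary stands in for the database table mentioned in step 4.

```python
import secrets


class PdfTokenStore(object):
    """Hypothetical in-memory stand-in for a database table of tokens."""

    def __init__(self):
        # Maps each unused token to the id of the user who requested the PDF.
        self._tokens = {}

    def issue(self, user_id):
        """Create a single-use token and remember which user it belongs to."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = user_id
        return token

    def redeem(self, token):
        """Return the user id for a valid token, or None.

        pop() removes the token on first use, so a second request
        with the same token is rejected.
        """
        return self._tokens.pop(token, None)
```

The unauthenticated view would call `redeem` with the token taken from the request (for example a query parameter, which is also an assumption here), return the HTML only when the result is not `None`, and respond with an error otherwise. The URL containing the token is what gets passed to wkhtmltopdf.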

We are looking for a better way of doing this, but it has worked for us so far.

Hope this helps.

Upvotes: 3
