anderswb

Reputation: 492

Unittesting pdf generated from website

I'm writing a package which is used for generating pdf files, by posting some data to a website and retrieving a generated pdf from the data.

My problem is with the unittests. So far I've tried posting a known dataset to the website, retrieving the pdf and comparing it to a pdf which I know is good. This works fine, however there's a timestamp in the pdf, which means the comparison fails the next day.

As I see it, I have three options.

So, I guess my question is two-fold. 1: How difficult would the pdf-to-image-then-blanking method be? 2: From a unittesting perspective, would it be a better approach to make a mock website, or just to test that I get some pdf-like data?

Upvotes: 1

Views: 461

Answers (1)

Ethan Furman

Reputation: 69051

  • option 4: figure out where the time stamp lives in the pdf, and compare the bytes before and after

For example, if the time stamp is at offset 11 and is two bytes long:

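# PDFs are binary files, so open both in 'rb' mode to compare raw bytes.
with open('reference.pdf', 'rb') as rf:
    reference_data = rf.read()
with open('pdf_from_website.pdf', 'rb') as wf:
    website_data = wf.read()
# Compare everything before and after the two-byte time stamp at offset 11.
self.assertEqual(reference_data[:11], website_data[:11])
self.assertEqual(reference_data[13:], website_data[13:])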

I'm not familiar with the innards of pdf files so this might not work. You could use diff to see where the differences are and try, though.
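If a diff tool is awkward to use on binary data, the same search can be sketched in Python. This is only a rough illustration: `differing_ranges` is a made-up helper name, and it assumes the two files are byte-identical apart from a same-length time stamp, which may not hold if the pdf content is compressed.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ."""
    ranges = []
    start = None  # start offset of the current run of differing bytes
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    # Any length mismatch is reported as one trailing differing range.
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

Running it on two pdfs fetched on different days should show whether the differences are confined to one small range (a good candidate for the slicing approach above) or scattered throughout the file.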

For your second question: It is best if you can test that the returned pdf is both valid and has the contents it should have.
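A very shallow validity check could look like the sketch below. It only relies on the fact that pdf files begin with a `%PDF-` header and carry a `%%EOF` marker near the end; `looks_like_pdf` is a hypothetical helper name, and this does not verify the contents, which would need a real pdf parser.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Heuristic check: pdf header at the start, end-of-file marker near the end."""
    has_header = data.startswith(b"%PDF-")
    # Look for the marker in the last 1024 bytes only, since trailing
    # whitespace or newlines may follow it.
    has_eof_marker = b"%%EOF" in data[-1024:]
    return has_header and has_eof_marker
```

In a test this might be used as `self.assertTrue(looks_like_pdf(website_data))`, with `website_data` read in binary mode.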

Upvotes: 1
