Eliot Kim


Can pikepdf decrypt a PDF stream from Azure Blob Storage that is not UTF-8 encoded?

I am accessing an encrypted PDF file stored in Azure Blob Storage. When I run the following code, I get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte:

import datetime

import pikepdf
import PyPDF2
from azure.storage.blob import BlobServiceClient

date = str(datetime.date.today().strftime("%Y%m%d%H%M%S"))
dec = "C:/Users/python-test/source_dir/WOTC_DENIALS_Decrypted-{0}.pdf".format(date)

blob_service_client = BlobServiceClient.from_connection_string(constr)
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
streamdownloader = blob_client.download_blob().readall()

pdf = pikepdf.Pdf.open(streamdownloader, password=password)
pdf.save(dec)
pdfFileObj = open(dec, 'rb')

pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

ending_page = pdfReader.numPages

print("Total number of pages detected accurately? ")
print(pdfReader.numPages)
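
A possible explanation, offered as a sketch rather than a confirmed fix: readall() returns raw bytes, while pikepdf.Pdf.open expects a filename or a seekable binary stream. Byte 0xe2 at position 10 is consistent with the binary comment line that commonly follows the "%PDF-1.x" header, which suggests the PDF content itself was being decoded as if it were a path. Wrapping the bytes in io.BytesIO may avoid that. The helper names bytes_to_stream and decrypt_pdf_bytes below are hypothetical and not part of the original code.

```python
import io


def bytes_to_stream(data: bytes) -> io.BytesIO:
    """Wrap raw bytes in a seekable in-memory binary stream."""
    return io.BytesIO(data)


def decrypt_pdf_bytes(data: bytes, password: str, out_path: str) -> int:
    """Open an encrypted PDF held in memory, save a decrypted copy,
    and return the page count. Hypothetical helper for illustration."""
    # Third-party import kept local so bytes_to_stream works without pikepdf.
    import pikepdf

    # Pass a stream, not bytes, so the content is not treated as a filename.
    with pikepdf.Pdf.open(bytes_to_stream(data), password=password) as pdf:
        pdf.save(out_path)
        return len(pdf.pages)
```

Under this assumption, the call in the question would become decrypt_pdf_bytes(streamdownloader, password, dec), and the returned page count could be compared against the PyPDF2 result.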
