Reputation: 277
I wrote a PDF cracking tool and found the password of a protected PDF file. I want to write a program in Python that can display that PDF file on the screen without asking for the password. I use the PyPDF library. I know how to open a file that has no password, but I can't figure out how to open the protected one. Any ideas? Thanks.
import subprocess
import sys

filePath = raw_input()  # Python 2; use input() on Python 3
password = 'abc'
if sys.platform.startswith('linux'):
    subprocess.call(["xdg-open", filePath])
Upvotes: 13
Views: 41619
Reputation: 11
Updated version of Bohumir Zamecnik's code for PyPDF2 3.0.0 and above:
from PyPDF2 import PdfReader, PdfWriter

def decrypt_pdf(input_path, output_path, password):
    with open(input_path, 'rb') as input_file, \
            open(output_path, 'wb') as output_file:
        reader = PdfReader(input_file)
        reader.decrypt(password)
        writer = PdfWriter()
        # copy every page of the decrypted document into the writer
        for i in range(len(reader.pages)):
            writer.add_page(reader.pages[i])
        # write the unencrypted copy once, after all pages are added
        writer.write(output_file)

decrypt_pdf('encrypted.pdf', 'decrypted.pdf', 'password')
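The function above does not check whether the password was actually accepted. Here is a minimal sketch of such a check, assuming PyPDF2 3.x, where `reader.decrypt()` returns a falsy value when the password is wrong. The name `decrypt_pdf_checked` is hypothetical and only used for illustration.

```python
def decrypt_pdf_checked(input_path, output_path, password):
    """Write an unencrypted copy of a PDF, failing loudly on a wrong password."""
    # Third-party dependency (pip install PyPDF2), imported on first use.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    if reader.is_encrypted:
        # decrypt() reports which password matched; a falsy result means none did
        if not reader.decrypt(password):
            raise ValueError("wrong password for %s" % input_path)

    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)

    # Open the output only after decryption succeeded, so a wrong
    # password does not leave an empty file behind.
    with open(output_path, 'wb') as output_file:
        writer.write(output_file)
```

Opening the output file only after the password check is a design choice, not something the original answer requires.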
Upvotes: 1
Reputation: 1062
You can use the pdfplumber library. It is very easy to use and reads machine-generated PDF files seamlessly, better than any other library I have used.
import pdfplumber

with pdfplumber.open(r'D:\examplepdf.pdf', password='abc') as pdf:
    first_page = pdf.pages[0]
    print(first_page.extract_text())
Upvotes: 4
Reputation: 646
You should use the pikepdf library nowadays instead:
import pikepdf

with pikepdf.open("input.pdf", password="abc") as pdf:
    num_pages = len(pdf.pages)
    print("Total pages:", num_pages)
PyPDF2 doesn't support many encryption algorithms; pikepdf seems to handle them. It supports most password-protection methods, and it is also documented and actively maintained.
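To get an unprotected copy that any viewer can open, the opened document can be saved again. This is a minimal sketch, assuming that `save()` called without an `encryption` argument writes the file unencrypted. The function name `save_decrypted_copy` is hypothetical.

```python
def save_decrypted_copy(input_path, output_path, password):
    """Write an unencrypted copy of a password-protected PDF.

    Returns True on success and False if the password was rejected.
    """
    # Third-party dependency (pip install pikepdf), imported on first use.
    import pikepdf

    try:
        with pikepdf.open(input_path, password=password) as pdf:
            # No `encryption` argument: the saved copy carries no password
            pdf.save(output_path)
    except pikepdf.PasswordError:
        return False
    return True
```

For example, `save_decrypted_copy("input.pdf", "output.pdf", "abc")` would leave the readable copy in `output.pdf`.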
Upvotes: 12
Reputation: 2815
The approach shown by KL84 basically works, but the code is not correct (it writes the output file for each page). A cleaned up version is here:
https://gist.github.com/bzamecnik/1abb64affb21322256f1c4ebbb59a364
# Decrypt password-protected PDF in Python.
#
# Requirements:
# pip install PyPDF2
# (this uses the pre-3.0 API; PdfFileReader and PdfFileWriter
# were removed in PyPDF2 3.0.0)

from PyPDF2 import PdfFileReader, PdfFileWriter

def decrypt_pdf(input_path, output_path, password):
    with open(input_path, 'rb') as input_file, \
            open(output_path, 'wb') as output_file:
        reader = PdfFileReader(input_file)
        reader.decrypt(password)
        writer = PdfFileWriter()
        for i in range(reader.getNumPages()):
            writer.addPage(reader.getPage(i))
        # write once, after all pages have been added
        writer.write(output_file)

if __name__ == '__main__':
    # example usage:
    decrypt_pdf('encrypted.pdf', 'decrypted.pdf', 'secret_password')
Upvotes: 18
Reputation: 277
I found the answer to this question. Basically, the PyPDF2 library needs to be installed and used to get this working.
import subprocess
import sys

from PyPDF2 import PdfFileReader, PdfFileWriter

# When you have the password ('abc' here), call decrypt() on the PyPDF2 reader to decrypt the PDF file
filePath = raw_input("Enter pdf file path: ")  # Python 2; use input() on Python 3
f = PdfFileReader(open(filePath, "rb"))
output = PdfFileWriter()
f.decrypt('abc')

# Copy the pages of the encrypted PDF to an unencrypted PDF named noPassPDF.pdf
for pageNumber in range(0, f.getNumPages()):
    output.addPage(f.getPage(pageNumber))
    # write "output" to noPassPDF.pdf
    # (this rewrites the file on every iteration; dedent these three lines to write it once)
    outputStream = open("noPassPDF.pdf", "wb")
    output.write(outputStream)
    outputStream.close()

# Open the file now
if sys.platform.startswith('darwin'):  # open on macOS
    subprocess.call(["open", "noPassPDF.pdf"])
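The snippets in this thread each open the viewer on one platform only (`xdg-open` on Linux in the question, `open` on macOS here). This is a minimal sketch that covers both. The Windows branch is an assumption that is not in the original posts, and `viewer_command` and `open_in_viewer` are hypothetical helper names.

```python
import subprocess
import sys


def viewer_command(path, platform=sys.platform):
    """Return the argument list that opens `path` in the default viewer."""
    if platform.startswith('linux'):
        return ["xdg-open", path]
    if platform.startswith('darwin'):
        return ["open", path]
    if platform.startswith('win'):
        # Assumption: `start` is a cmd.exe builtin; "" is the window title
        return ["cmd", "/c", "start", "", path]
    raise OSError("no known viewer command for platform %r" % platform)


def open_in_viewer(path):
    """Launch the platform's default viewer and return its exit code."""
    return subprocess.call(viewer_command(path))
```

After the decrypted copy has been written, `open_in_viewer("noPassPDF.pdf")` would display it.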
Upvotes: 1