Jum Remdesk

Reputation: 111

Check password on a pdf file without opening

I'm trying to write a brute-force method to find the password of PDF files. I use this method to check a password with iTextSharp:

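using iTextSharp.text.exceptions;
using iTextSharp.text.pdf;

public static bool IsPasswordValid(string pdfFullname, byte[] password) {
    try {
        // The constructor throws BadPasswordException when the password is rejected.
        PdfReader pdfReader = new PdfReader(pdfFullname, password);
        pdfReader.Close(); // release the file before the next attempt
        return true;
    } catch (BadPasswordException) {
        return false;
    }
}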

It works, but it takes a very long time when the PDF file is large. Is there a way to check the password without loading the file into memory?

Upvotes: 0

Views: 2712

Answers (1)

Chris Haas

Reputation: 55457

A couple of things.

First, in my blog post here I found that the constructor overload that takes a RandomAccessFileOrArray is the fastest. I just tested again with source code that I've got sitting around (5.5.6.0), and it is still twice as fast as the string-based path overload for large files. However, that constructor overload is marked as obsolete and might actually be removed in more recent versions, so you'll need to live with that.

PdfReader pdfReader = new PdfReader(new RandomAccessFileOrArray(pdfFullname, true), password);

To be very clear, that overload was deliberately marked obsolete for some reason, so you might want to consider not using it.

Second, the question "Is there a way to check the password without loading the file into memory?" is actually misleading. If memory were the bottleneck, you'd have much bigger problems. I just did a quick test using System.IO.File.ReadAllBytes() on an 80MB file: it took about 90 milliseconds to load everything into memory, and my computer is seven years old.
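If you want to repeat that measurement on your own machine, something like the following should do. It is only a rough sketch: `LoadTimer` and `TimeReadAllBytes` are names I made up for illustration, and a single run like this is a ballpark figure, not a proper benchmark (the OS file cache will make repeat runs faster).

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class LoadTimer
{
    // Reads the whole file into memory and returns how long that took.
    // byteCount receives the number of bytes that were read.
    public static TimeSpan TimeReadAllBytes(string path, out long byteCount)
    {
        var stopwatch = Stopwatch.StartNew();
        byte[] bytes = File.ReadAllBytes(path);
        stopwatch.Stop();
        byteCount = bytes.LongLength;
        return stopwatch.Elapsed;
    }
}
```

Call it as `LoadTimer.TimeReadAllBytes(pdfFullname, out long size)` and compare the elapsed time with how long a single `IsPasswordValid` call takes. If the read is a small fraction of the total, the time is going into parsing rather than I/O.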

The actual speed problem is that iText needs to find the trailer of the PDF, which has a pointer to the /Encrypt dictionary. Because iText is intended to actually do something with a PDF, it doesn't spend much effort optimizing this path, since that work is going to have to happen eventually no matter what. If you really, really care about speed, this is where I'd start. I'd recommend checking out Adobe's spec to see how the standard PDF encryption works; it's relatively simple. There's also a great simplified description here.
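To give an idea of what "finding the pointer to /Encrypt" can look like without a full parser, here is a minimal sketch that reads only the tail of the file and searches for an indirect reference of the form `/Encrypt n g R`. The class and method names (`EncryptLocator`, `FindEncryptReference`) and the 4096-byte tail size are my own assumptions, not anything iText does. It will miss files where the trailer or cross-reference stream dictionary sits further from the end, and files where /Encrypt is written as a direct dictionary instead of a reference.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class EncryptLocator
{
    // Returns "objectNumber generation" of the last /Encrypt reference found
    // in the final tailSize bytes of the file, or null if there is none.
    public static string FindEncryptReference(string path, int tailSize = 4096)
    {
        using (var stream = File.OpenRead(path))
        {
            int count = (int)Math.Min(tailSize, stream.Length);
            stream.Seek(-count, SeekOrigin.End);

            var buffer = new byte[count];
            int read = 0;
            while (read < count)
            {
                int n = stream.Read(buffer, read, count - read);
                if (n == 0) break; // unexpected end of file
                read += n;
            }

            // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
            string tail = Encoding.GetEncoding("ISO-8859-1").GetString(buffer, 0, read);

            // RightToLeft returns the match closest to the end of the file,
            // i.e. the most recent trailer when the document has several.
            Match match = Regex.Match(tail, @"/Encrypt\s+(\d+)\s+(\d+)\s+R", RegexOptions.RightToLeft);
            return match.Success ? match.Groups[1].Value + " " + match.Groups[2].Value : null;
        }
    }
}
```

A return value of `"7 0"` would mean the encryption dictionary is object 7, generation 0. You would still have to resolve that object through the cross-reference table to read its entries.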

If your goal is cracking, you should be able to write a very crude password guesser that looks for the trailer, finds the /Encrypt key, and processes the /O and /U keys. There are lots of "gotchas" in this if you haven't read the spec (for instance, documents can have multiple trailer entries, there are alternatives to passwords, etc.), but this should probably get you 99% of common PDFs out there.
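For illustration, here is roughly what the user-password check from the spec looks like once you have the values in hand. This is a sketch under several assumptions: it covers only the standard security handler at revisions 2 and 3 (RC4), it ignores the owner password and the newer AES-based revisions, and it expects you to have already extracted /O, /U, /P, the key length in bytes (5 for revision 2, /Length divided by 8 for revision 3) and the first element of the trailer's /ID array. All class, method and parameter names are mine. Please check it against the spec and a few known files before trusting it.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class PdfUserPassword
{
    // The 32-byte padding string defined by the PDF specification.
    static readonly byte[] Pad = {
        0x28,0xBF,0x4E,0x5E,0x4E,0x75,0x8A,0x41,0x64,0x00,0x4E,0x56,0xFF,0xFA,0x01,0x08,
        0x2E,0x2E,0x00,0xB6,0xD0,0x68,0x3E,0x80,0x2F,0x0C,0xA9,0xFE,0x64,0x53,0x69,0x7A };

    // Plain RC4; .NET has no built-in implementation.
    static byte[] Rc4(byte[] key, byte[] data)
    {
        var s = new byte[256];
        for (int i = 0; i < 256; i++) s[i] = (byte)i;
        for (int i = 0, j = 0; i < 256; i++)
        {
            j = (j + s[i] + key[i % key.Length]) & 0xFF;
            byte t = s[i]; s[i] = s[j]; s[j] = t;
        }
        var output = new byte[data.Length];
        for (int n = 0, a = 0, b = 0; n < data.Length; n++)
        {
            a = (a + 1) & 0xFF;
            b = (b + s[a]) & 0xFF;
            byte t = s[a]; s[a] = s[b]; s[b] = t;
            output[n] = (byte)(data[n] ^ s[(s[a] + s[b]) & 0xFF]);
        }
        return output;
    }

    // Encryption key: MD5 over padded password, /O, /P (little-endian) and /ID[0].
    static byte[] ComputeKey(byte[] password, byte[] o, int p, byte[] id0, int revision, int keyLengthBytes)
    {
        byte[] padded = password.Concat(Pad).Take(32).ToArray();
        var pBytes = new[] { (byte)p, (byte)(p >> 8), (byte)(p >> 16), (byte)(p >> 24) };
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(padded.Concat(o).Concat(pBytes).Concat(id0).ToArray());
            if (revision >= 3)
                for (int i = 0; i < 50; i++)
                    hash = md5.ComputeHash(hash, 0, keyLengthBytes);
            return hash.Take(keyLengthBytes).ToArray();
        }
    }

    // Computes the value that /U should hold for the given user password.
    public static byte[] ComputeU(byte[] password, byte[] o, int p, byte[] id0, int revision, int keyLengthBytes)
    {
        byte[] key = ComputeKey(password, o, p, id0, revision, keyLengthBytes);
        if (revision == 2) return Rc4(key, Pad);

        byte[] value;
        using (var md5 = MD5.Create())
            value = md5.ComputeHash(Pad.Concat(id0).ToArray());
        // Round 0 uses the key itself; rounds 1..19 use the key XORed with the round number.
        for (int i = 0; i < 20; i++)
            value = Rc4(key.Select(k => (byte)(k ^ i)).ToArray(), value);
        return value;
    }

    // Revision 2 compares all 32 bytes of /U; revision 3 only the first 16.
    public static bool IsUserPassword(byte[] password, byte[] o, byte[] u, int p, byte[] id0, int revision, int keyLengthBytes)
    {
        byte[] expected = ComputeU(password, o, p, id0, revision, keyLengthBytes);
        int n = revision == 2 ? 32 : 16;
        return expected.Take(n).SequenceEqual(u.Take(n));
    }
}
```

The point of doing it this way is that each guess costs a handful of MD5 and RC4 rounds over a few dozen bytes, with no file access at all once the five values have been read.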

Upvotes: 3
