Reputation: 1744
I'm working on a feed-reading iPhone app that displays NSData objects (HTML and PDF) in a UIWebView, and I've hit a snag in some PDF validation logic. I have an NSData object which I know contains a file with a .pdf extension. I would like to stop invalid PDFs from getting any further. Here's my first attempt at validation code, which seems to work in the majority of cases:
// pdfData is an NSData *
NSData *validPDF = [@"%PDF" dataUsingEncoding:NSASCIIStringEncoding];
// Check the length first: subdataWithRange: raises an exception if the range is out of bounds.
if (!(pdfData.length >= 4 && [[pdfData subdataWithRange:NSMakeRange(0, 4)] isEqualToData:validPDF])) {
    // error
}
Unfortunately, a new PDF was uploaded a few days ago. It is valid in the sense that the UIWebView displays it fine, yet it fails my validation test. I have tracked the issue down to a bunch of garbage bytes at the beginning, with the %PDF starting midway through the 14th group of hex characters (the 25, or %, sits at zero-based byte offset 54):
%PDF: 25504446
Breaking PDF: 00010000 00ffffff ff010000 00000000 000f0100 0000b5e0 04000200 01000000 ffffffff 01000000 00000000 0f010000 0099e004 00022550 44462d31 etc...
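The position of the marker can be double-checked with a short Swift sketch. It rebuilds the leading bytes from the hex dump above and searches for `%PDF`, printing the zero-based offset of the match. `hexToData` is a hypothetical helper introduced here only for illustration.

```swift
import Foundation

// Hypothetical helper: turns a whitespace-separated hex dump into Data.
func hexToData(_ hex: String) -> Data {
    let digits = Array(hex.filter { $0.isHexDigit })
    var bytes = [UInt8]()
    var i = 0
    while i + 1 < digits.count {
        // Each pair of hex digits encodes one byte.
        bytes.append(UInt8(String(digits[i...i + 1]), radix: 16)!)
        i += 2
    }
    return Data(bytes)
}

// Leading bytes of the problem file, copied from the dump above.
let dump = "00010000 00ffffff ff010000 00000000 000f0100 0000b5e0 04000200 01000000 ffffffff 01000000 00000000 0f010000 0099e004 00022550 44462d31"
let data = hexToData(dump)
let marker = Data("%PDF".utf8)

// Zero-based offset of the first "%PDF" in the dump, or nil if absent.
let markerOffset = data.range(of: marker)?.lowerBound
print(markerOffset as Any) // prints "Optional(54)"
```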
What is the best practice for validating NSData to be a PDF?
What might be wrong with this particular PDF (it claims it was encoded by PaperPort 11.0, whatever that is)?
Thanks,
Mike
Upvotes: 9
Views: 4403
Reputation: 420
The previous answers don't work for me. In some cases they return false for valid PDF data.
Using this works for me:
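import PDFKit
func isPDFData(data: Data) -> Bool {
    // PDFDocument(data:) returns nil when the data cannot be parsed as a PDF.
    return PDFDocument(data: data) != nil
}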
Upvotes: 1
Reputation: 1390
Swift 4
extension Data {
    /// True when the "%PDF" marker appears within the first 1024 bytes.
    var isPDF: Bool {
        // Data shorter than 1024 bytes is rejected outright.
        guard count >= 1024 else { return false }
        let pdfHeader = Data("%PDF".utf8)
        return range(of: pdfHeader, options: [], in: 0..<1024) != nil
    }
}
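The extension above returns false for any data shorter than 1024 bytes, so a small but otherwise valid PDF would be rejected. A minimal sketch of a variant that searches at most the first 1024 bytes, whatever the total length, might look like this. The property name `hasPDFHeaderNearStart` is hypothetical and not part of the answer.

```swift
import Foundation

extension Data {
    // Hypothetical name; looks for "%PDF" in at most the first 1024 bytes.
    var hasPDFHeaderNearStart: Bool {
        let marker = Data("%PDF".utf8)
        // prefix(1024) yields the whole buffer when it is shorter than 1024 bytes.
        // Copying into a fresh Data keeps the indices zero-based.
        let window = Data(self.prefix(1024))
        return window.range(of: marker) != nil
    }
}
```

With this variant a short buffer such as `Data("%PDF-1.4\n".utf8)` passes, while a marker that first appears after byte 1024 is still rejected.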
Upvotes: 5
Reputation: 359
let fileManager = FileManager.default
let documentsPath = NSSearchPathForDirectoriesInDomains(.documentDirectory, .userDomainMask, true)[0]
let rootDirectory = "\(documentsPath)/\(caption!)/"
let imageURL = URL(fileURLWithPath: rootDirectory).appendingPathComponent("0")
let fileExists = fileManager.fileExists(atPath: imageURL.path)
var isPDF = false
// Only check files of at least 1024 bytes; an unreadable file leaves isPDF false.
if let ns = NSData(contentsOf: imageURL), ns.length >= 1024 {
    let pdfBytes: [UInt8] = [0x25, 0x50, 0x44, 0x46] // "%PDF"
    let pdfHeader = Data(pdfBytes)
    // No .anchored option here: it would only match "%PDF" at the very start of the range.
    let found = ns.range(of: pdfHeader, options: [], in: NSMakeRange(0, 1024))
    isPDF = found.location != NSNotFound
}
Upvotes: 4
Reputation: 598
Maybe try this:
// Validate PDF data held in an NSData
- (BOOL)isValidePDF:(NSData *)pdfData {
    // Only data of at least 1024 bytes is checked.
    if (pdfData.length < 1024) {
        return NO;
    }
    const NSUInteger startMetaCount = 4, endMetaCount = 5;
    // Markers to look for: "%PDF" near the start and "%%EOF" anywhere in the data.
    NSData *startPDFData = [NSData dataWithBytes:"%PDF" length:startMetaCount];
    NSData *endPDFData = [NSData dataWithBytes:"%%EOF" length:endMetaCount];
    // "%PDF" must appear within the first 1024 bytes; "%%EOF" may appear at any offset.
    NSRange startRange = [pdfData rangeOfData:startPDFData options:0 range:NSMakeRange(0, 1024)];
    NSRange endRange = [pdfData rangeOfData:endPDFData options:0 range:NSMakeRange(0, pdfData.length)];
    // Neither marker is required to sit at a fixed position.
    return startRange.location != NSNotFound && startRange.length == startMetaCount
        && endRange.location != NSNotFound && endRange.length == endMetaCount;
}
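The same two-marker idea can be sketched in Swift. This version searches for `%%EOF` backwards from the end, since the trailer normally sits at the end of the file. The function name `looksLikePDF` and the 1024-byte window for the trailer are assumptions made for illustration, not something the answer above specifies.

```swift
import Foundation

// Hypothetical helper name; the 1024-byte trailer window is an assumption.
func looksLikePDF(_ data: Data) -> Bool {
    let header = Data("%PDF".utf8)
    let trailer = Data("%%EOF".utf8)

    // Copy the windows into fresh Data values so their indices start at zero.
    let head = Data(data.prefix(1024))
    let tail = Data(data.suffix(1024))

    let hasHeader = head.range(of: header) != nil
    // Search from the end of the tail window towards its start.
    let hasTrailer = tail.range(of: trailer, options: .backwards) != nil
    return hasHeader && hasTrailer
}
```

Like the Objective-C method, this is only a heuristic: it confirms that both markers are present, not that the bytes between them form a well-formed PDF.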
Upvotes: 3
Reputation: 2907
In Swift I have the following:
var isPDF = false
// Only check data of at least 1024 bytes.
if assetData.length >= 1024 {
    let pdfBytes: [UInt8] = [0x25, 0x50, 0x44, 0x46] // "%PDF"
    let pdfHeader = Data(pdfBytes)
    // assetData is an NSData; search its first 1024 bytes for the marker.
    let foundRange = assetData.range(of: pdfHeader, options: [], in: NSMakeRange(0, 1024))
    if foundRange.length > 0 {
        isPDF = true
    }
}
Upvotes: 3