Vivek Kumar

Reputation: 63

Validate that an uploaded file is a PDF

How can I validate that an uploaded file really is a PDF, checking not only the extension (.pdf) but also the content? If someone changes the extension of some other file to .pdf, the upload should fail.

Upvotes: 3

Views: 11122

Answers (4)

sudha thiruvengadam

Reputation: 1

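// Requires Apache Tika (tika-core). Imports: java.io.FileInputStream, java.io.InputStream,
// org.apache.tika.detect.DefaultDetector, org.apache.tika.detect.Detector,
// org.apache.tika.io.TikaInputStream, org.apache.tika.metadata.Metadata
try (InputStream inputStream = new FileInputStream(pdfFilePath);
     TikaInputStream tikaStream = TikaInputStream.get(inputStream)) {
    Detector detector = new DefaultDetector();
    // No resource name is added to the metadata, so detection relies on
    // the file content and not on the file extension
    Metadata metadata = new Metadata();

    // Detect the MIME type using Tika's Detector
    String mimeType = detector.detect(tikaStream, metadata).toString();
    if ("application/pdf".equals(mimeType)) {
        System.out.println("The file is a PDF document.");
        // perform your function
    } else {
        System.out.println("The file is not a valid PDF document.");
    }
}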

Upvotes: -1

Kayn Serhal

Reputation: 133

You can use Apache Tika for this, available here: http://tika.apache.org/

You can also find a practical example here: https://dzone.com/articles/determining-file-types-java
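
For a dependency-free first check, here is a minimal sketch of the magic-byte idea that content-based detection builds on: a PDF file is expected to begin with the ASCII marker %PDF-. The class and method names (PdfMagic, hasPdfHeader, looksLikePdf) are made up for illustration. It only inspects the first five bytes, so it catches a renamed image or archive but does not prove the rest of the file is a well-formed PDF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Tika or any other library.
final class PdfMagic {
    // A PDF is expected to start with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfMagic() {
    }

    /** Returns true if the given leading bytes start with "%PDF-". */
    public static boolean hasPdfHeader(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first bytes of the file and checks them for the PDF marker. */
    public static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes(int) needs Java 11 or newer
            byte[] head = in.readNBytes(PDF_MAGIC.length);
            return hasPdfHeader(head);
        }
    }
}
```

In an upload handler you could call PdfMagic.looksLikePdf on the stored temporary file and reject the upload when it returns false. A library such as Tika remains the more thorough option.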

Upvotes: 3

Mori Manish

Reputation: 179

There are many ways to validate a PDF file. I used iText to check whether the PDF is corrupted.

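// Assuming iText 5: com.itextpdf.text.pdf.PdfReader,
// com.itextpdf.text.pdf.parser.PdfTextExtractor,
// com.itextpdf.text.exceptions.InvalidPdfException
try {
    // Opening the reader parses the PDF structure
    PdfReader pdfReader = new PdfReader(file);

    // Extracting text from page 1 forces the page content to be read
    PdfTextExtractor.getTextFromPage(pdfReader, 1);
    pdfReader.close();

    LOGGER.info("pdfFileValidator ==> Exit");
    return true;
} catch (InvalidPdfException e) {
    LOGGER.error("pdfFileValidator ==> Exit. Error ==> " + e.getMessage());
    return false;
}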

If the file is not a PDF or is corrupted, it throws InvalidPdfException. The example above requires the iText library.

Upvotes: 4

mnestorov

Reputation: 4494

There are many validation libraries you can use to check whether a file is PDF compliant, for instance veraPDF or PDFBox. Of course, you can use any other library that does the job. As already mentioned, Tika is another library that can read file metadata and tell you what the file is.

As a bare-bones example, you can do something like this with PDFBox. Keep in mind that this validates whether the file is PDF/A compliant, so a regular PDF that is not PDF/A will fail the check.

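boolean validateImpl(File file) {
    try {
        // Assuming the PDFBox 2.x Preflight API, where parse() must run
        // before getPreflightDocument()
        PreflightParser parser = new PreflightParser(file);
        parser.parse();

        try (PreflightDocument document = parser.getPreflightDocument()) {
            document.validate();
            ValidationResult validationResult = document.getResult();
            return validationResult.isValid();
        }
    } catch (Exception e) {
        // Parsing or validation failed, so treat the file as invalid
        return false;
    }
}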

Or with Tika, you can do something like:

public String tikaDetect(File file) throws IOException {

    Tika tika = new Tika();

    // Returns the detected media type, e.g. "application/pdf" for a PDF
    return tika.detect(file);
}

Upvotes: 2
