Reputation: 197
Is it possible to return the total page count of an external PDF file via XSL? Does the AntennaHouse Formatter have an equivalent extension?
Thanks in advance!
Upvotes: 0
Views: 997
Reputation: 1304
If you are using a Java-based XSLT processor that allows external function calls (such as Saxon PE or EE), Apache PDFBox will help you.
PDFBox: https://pdfbox.apache.org/
PDFBox’s PDDocument class has a method that returns the page count of the target PDF, so you can get the page count as follows:
[Java sample code]
package com.acme.pdfutil;

import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;

public class pdfDocument {
    /**
     * Get the page count of the specified PDF file.
     * @param filePath Path of the PDF file
     * @return Page count, or -1 if the file could not be read
     */
    public static int getPageCount(String filePath) {
        File pdfFile = null;
        PDDocument pdfDoc = null;
        int pageCount = -1;
        try {
            pdfFile = new File(filePath);
            // PDDocument.load is the PDFBox 2.x API; PDFBox 3.x uses
            // org.apache.pdfbox.Loader.loadPDF(File) instead.
            pdfDoc = PDDocument.load(pdfFile);
            pageCount = pdfDoc.getNumberOfPages();
        }
        catch (Exception e) {
            System.out.println("[getPageCount] " + e.getMessage());
        }
        finally {
            if (pdfDoc != null) {
                try {
                    pdfDoc.close();
                }
                catch (Exception e) {
                    // Ignore failures while closing.
                }
            }
        }
        return pageCount;
    }
}
[XSLT stylesheet]
<xsl:stylesheet version="2.0"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:acmejava="java:com.acme.pdfutil.pdfDocument"
>
…
<!-- Call external function -->
<xsl:variable name="pdfPageCount" as="xs:integer" select="acmejava:getPageCount($pdfPath)"/>
…
Upvotes: 2
Reputation: 8068
Not out of the box, no. Ways to do it would include:
Run grep, etc., on the PDF and save the output of that to a file to be read. See, e.g., http://www.unix.com/printthread.php?t=55661&pp=40
Read the PDF with unparsed-text(), then use XSLT's regular expression ability to find the right string(s).
Upvotes: 2
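For what it's worth, the grep/regex idea can be sketched in Java as well. This is only a rough sketch, not a substitute for a real PDF parser: the class name PdfPageGrep and its method are made up for illustration, and it assumes the page objects are stored uncompressed in the file. PDFs that pack objects into compressed object streams, or that still contain stale objects from incremental saves, will give a wrong count.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts "/Type /Page" dictionaries by pattern matching.
final class PdfPageGrep {
    // Matches "/Type /Page" or "/Type/Page", but not "/Type /Pages"
    // (the page-tree node), thanks to the negative lookahead.
    private static final Pattern PAGE_OBJECT =
            Pattern.compile("/Type\\s*/Page(?![A-Za-z])");

    // Count page-object markers in raw PDF bytes.
    static int countPageObjects(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // sections of the file do not break the decoding.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher matcher = PAGE_OBJECT.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Convenience overload that reads the file first.
    static int countPageObjects(String filePath) throws IOException {
        return countPageObjects(Files.readAllBytes(Paths.get(filePath)));
    }
}
```

If the count has to be reliable, a parser-based approach such as PDFBox is the safer choice.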