Reputation: 841
Does anyone know the best way to take a PDF document and replace every substring that matches a pattern ( [A-Z][A-Z][A-Z] ' ' [0-9][0-9][0-9][0-9]|[A-Z] ) with a hyperlink whose text and target are both the matched string?
I plan to let a user view the PDF document (which is a list of classes they can take for a degree) and click a class in order to add it to a list.
I understand that I can add a HyperlinkListener to a JEditorPane, and I am assuming (hoping) that it will work on hyperlinks in a PDF.
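A minimal sketch of the listener side, under one loud assumption: the course list is shown in the JEditorPane as HTML (JEditorPane's built-in editor kits cover plain text, HTML, and RTF, not PDF), so the PDF text would have to be extracted and turned into anchors first. `CourseLinkCollector`, `selected()`, and `newPane` are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.JEditorPane;
import javax.swing.event.HyperlinkEvent;
import javax.swing.event.HyperlinkListener;

// Hypothetical helper: remembers the href of every hyperlink the user clicks.
final class CourseLinkCollector implements HyperlinkListener {
    private final List<String> selected = new ArrayList<>();

    @Override
    public void hyperlinkUpdate(HyperlinkEvent e) {
        // ACTIVATED fires on a click; ENTERED and EXITED fire on hover and are ignored.
        if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED) {
            String course = e.getDescription(); // the raw href text, e.g. "CSC 1301"
            if (course != null && !selected.contains(course)) {
                selected.add(course);
            }
        }
    }

    // Courses clicked so far, in click order, without duplicates.
    List<String> selected() {
        return selected;
    }

    // Assumption: `html` already contains <a href="..."> anchors for each course.
    static JEditorPane newPane(String html, HyperlinkListener listener) {
        JEditorPane pane = new JEditorPane();
        pane.setContentType("text/html");
        pane.setEditable(false); // hyperlink events are only delivered when the pane is not editable
        pane.setText(html);
        pane.addHyperlinkListener(listener);
        return pane;
    }
}
```

Usage would be along the lines of `CourseLinkCollector.newPane(html, collector)`, then reading `collector.selected()` whenever the list needs to be shown. Because the href here is a course code rather than a real URL, the sketch reads `getDescription()` instead of `getURL()`.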
I am looking into PDFBox and iText, but so far I am stuck on how to replace the text.
*I plan to pull the PDFs from a URL and format the hyperlinks on the fly, so there is no need to export to a file.
Looking forward to feedback.
Upvotes: 1
Views: 2666
Reputation: 841
I found this example: http://pdfbox.apache.org/apidocs/org/apache/pdfbox/examples/pdmodel/ReplaceString.html
By incorporating a Pattern matcher into that code, I am able to update the text: each string that matches the pattern is replaced with a new string built from the matched string itself.
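A rough sketch of just the matching step, in plain Java and without any PDFBox calls. Two things here are assumptions rather than what the actual code does: the pattern is read as "three uppercase letters, a space, three digits, then a digit or an uppercase letter", and the replacement is an HTML-style anchor. `CourseLinker` and `linkify` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper: wraps every course code in an anchor that points at itself.
final class CourseLinker {
    // Assumed reading of the pattern from the question; \b keeps it from
    // matching inside longer runs of letters or digits.
    static final Pattern COURSE = Pattern.compile("\\b[A-Z]{3} [0-9]{3}[0-9A-Z]\\b");

    static String linkify(String text) {
        // In a Java replacement string, $0 stands for the whole match,
        // so the link text and the link target are the same string.
        return COURSE.matcher(text).replaceAll("<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("Take CSC 1301 and MAT 220A."));
    }
}
```

The same `Pattern` could be applied to each string the ReplaceString example visits. Note that a course code split across several text-showing operators in the PDF content stream would not be seen as one string, so it would not match.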
Upvotes: 2