Reputation: 167
So I've been browsing around the source code / documentation for POI (specifically XWPF) and I can't seem to find anything that relates to editing a hyperlink in a .docx. I only see functionality to get the information for the currently set hyperlink. My goal is to change the hyperlink in a .docx to link to "http://yahoo.com" from "http://google.com" as an example. Any help would be greatly appreciated. Thanks!
Upvotes: 5
Views: 1198
Reputation: 61945
Doing this requires knowing how hyperlinks that refer to an external reference are stored in Microsoft Word documents and how this is represented in the XWPF part of Apache POI.
The XWPFHyperlinkRun is the representation of a linked text run in an IRunBody. This text run, or even multiple text runs, is wrapped in an XML object of type CTHyperlink. This contains a relation ID which points to a relation in the package relations part. This package relation contains the URI which is the hyperlink's target.
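To make the storage model above concrete, here is a minimal sketch that uses only the JDK's DOM parser, not Apache POI. The two XML strings are simplified, assumed stand-ins for a fragment of word/document.xml and for the relationships part; the class name HyperlinkRelationDemo and the ID rId4 are hypothetical. It shows that the hyperlink element only carries a relation ID and that the URI itself lives in the relationships part.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HyperlinkRelationDemo {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";

    // Assumed sample data: a paragraph with one hyperlink wrapping one text run.
    static final String DOCUMENT_XML =
        "<w:p xmlns:w='" + W_NS + "' xmlns:r='" + R_NS + "'>"
      + "<w:hyperlink r:id='rId4'><w:r><w:t>Google</w:t></w:r></w:hyperlink>"
      + "</w:p>";

    // Assumed sample data: the relationships part that holds the actual target URI.
    static final String RELS_XML =
        "<Relationships xmlns='" + PKG_REL_NS + "'>"
      + "<Relationship Id='rId4' Type='" + R_NS + "/hyperlink'"
      + " Target='http://google.com' TargetMode='External'/>"
      + "</Relationships>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Reads the r:id of the first w:hyperlink and looks up its Target in the relationships. */
    static String resolveTarget(String documentXml, String relsXml) throws Exception {
        Element link = (Element) parse(documentXml)
            .getElementsByTagNameNS(W_NS, "hyperlink").item(0);
        String relId = link.getAttributeNS(R_NS, "id");
        NodeList rels = parse(relsXml).getElementsByTagNameNS(PKG_REL_NS, "Relationship");
        for (int i = 0; i < rels.getLength(); i++) {
            Element rel = (Element) rels.item(i);
            if (relId.equals(rel.getAttribute("Id"))) {
                return rel.getAttribute("Target");
            }
        }
        return null; // no matching relationship found
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolveTarget(DOCUMENT_XML, RELS_XML)); // prints http://google.com
    }
}
```

Changing the link target therefore means changing the relationship, not the hyperlink element in the document body.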
Currently (Apache POI 5.2.2) XWPFHyperlinkRun provides access to an XWPFHyperlink. But this is very rudimentary. It only has getters for the Id and the URI. It provides neither access to its XWPFHyperlinkRun and its IRunBody nor a setter for the target URI in the package relations part. It does not even have internal access to the package relations part.
So, using only Apache POI classes, the only possibility currently is to delete the old XWPFHyperlinkRun and create a new one pointing to the new URI. But as the text runs also contain the text formatting, deleting them also deletes the text formatting. The formatting would have to be copied from the old XWPFHyperlinkRun to the new one before deleting the old one. That is inconvenient.
So the rudimentary XWPFHyperlink should be extended to provide a setter for the target URI in the package relations part. A new class XWPFHyperlinkExtended could look like this:
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.openxml4j.opc.PackageRelationship;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

/**
 * Extended XWPF hyperlink class
 * Provides access to its Id, URI, XWPFHyperlinkRun, IRunBody.
 * Provides setting target URI in PackageRelationship.
 */
public class XWPFHyperlinkExtended {

    private String id;
    private String uri;
    private XWPFHyperlinkRun hyperlinkRun;
    private IRunBody runBody;
    private PackageRelationship rel;

    public XWPFHyperlinkExtended(XWPFHyperlinkRun hyperlinkRun, PackageRelationship rel) {
        this.id = rel.getId();
        this.uri = rel.getTargetURI().toString();
        this.hyperlinkRun = hyperlinkRun;
        this.runBody = hyperlinkRun.getParent();
        this.rel = rel;
    }

    public String getId() {
        return this.id;
    }

    public String getURI() {
        return this.uri;
    }

    public IRunBody getIRunBody() {
        return this.runBody;
    }

    public XWPFHyperlinkRun getHyperlinkRun() {
        return this.hyperlinkRun;
    }

    /**
     * Provides setting target URI in PackageRelationship.
     * The old PackageRelationship gets removed.
     * A new PackageRelationship gets added using the same Id.
     */
    public void setTargetURI(String uri) {
        this.runBody.getPart().getPackagePart().removeRelationship(this.getId());
        this.uri = uri;
        PackageRelationship rel = this.runBody.getPart().getPackagePart().addExternalRelationship(uri, XWPFRelation.HYPERLINK.getRelation(), this.getId());
        this.rel = rel;
    }
}
It does not extend XWPFHyperlink, as that class is so rudimentary that it is not worth it. Furthermore, after setTargetURI the String uri needs to be updated, but there is no setter in XWPFHyperlink and the field is only accessible from inside the package.
The new XWPFHyperlinkExtended can be obtained from an XWPFHyperlinkRun like this:
/**
 * If this HyperlinkRun refers to an external reference hyperlink,
 * return the XWPFHyperlinkExtended object for it.
 * May return null if no PackageRelationship found.
 */
/*modifiers*/ XWPFHyperlinkExtended getHyperlink(XWPFHyperlinkRun hyperlinkRun) {
    try {
        for (org.apache.poi.openxml4j.opc.PackageRelationship rel : hyperlinkRun.getParent().getPart().getPackagePart().getRelationshipsByType(XWPFRelation.HYPERLINK.getRelation())) {
            if (rel.getId().equals(hyperlinkRun.getHyperlinkId())) {
                return new XWPFHyperlinkExtended(hyperlinkRun, rel);
            }
        }
    } catch (org.apache.poi.openxml4j.exceptions.InvalidFormatException ifex) {
        // do nothing, simply do not return something
    }
    return null;
}
Once we have that XWPFHyperlinkExtended we can set a new target URI using its method setTargetURI.
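As a rough illustration of what a target change amounts to at the package level, the following sketch edits a relationships fragment with the JDK's DOM API only. It is not the POI code path: the answer's setTargetURI removes the relationship and adds a new one with the same Id, whereas this sketch simply overwrites the Target attribute in place. The sample XML, the class name RetargetRelationshipDemo and the IDs rId4 and rId5 are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RetargetRelationshipDemo {
    static final String PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";
    static final String HYPERLINK_TYPE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink";

    // Assumed sample data: two external hyperlink relationships.
    static final String RELS_XML =
        "<Relationships xmlns='" + PKG_REL_NS + "'>"
      + "<Relationship Id='rId4' Type='" + HYPERLINK_TYPE + "'"
      + " Target='http://google.com' TargetMode='External'/>"
      + "<Relationship Id='rId5' Type='" + HYPERLINK_TYPE + "'"
      + " Target='https://example.com' TargetMode='External'/>"
      + "</Relationships>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the Relationship element with the given Id, or null if there is none. */
    static Element findById(Document rels, String id) {
        NodeList list = rels.getElementsByTagNameNS(PKG_REL_NS, "Relationship");
        for (int i = 0; i < list.getLength(); i++) {
            Element rel = (Element) list.item(i);
            if (id.equals(rel.getAttribute("Id"))) {
                return rel;
            }
        }
        return null;
    }

    /** Points the relationship with the given Id at a new URI; the Id itself is kept. */
    static boolean retarget(Document rels, String id, String newUri) {
        Element rel = findById(rels, id);
        if (rel == null) {
            return false;
        }
        rel.setAttribute("Target", newUri);
        return true;
    }

    public static void main(String[] args) throws Exception {
        Document rels = parse(RELS_XML);
        retarget(rels, "rId4", "http://yahoo.com");
        System.out.println(findById(rels, "rId4").getAttribute("Target")); // prints http://yahoo.com
        System.out.println(findById(rels, "rId5").getAttribute("Target")); // prints https://example.com
    }
}
```

Because the Id is unchanged, every hyperlink element that refers to rId4 follows the new target without the document body being touched, which is why the text formatting of the runs survives.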
A further problem results from the fact that the XML object of type CTHyperlink can wrap around multiple text runs, not only one. Then multiple XWPFHyperlinkRuns are in one CTHyperlink and point to one target URI. For example, this could look like:
... [this is a link to example.com]->https://example.com ...
This results in 6 XWPFHyperlinkRuns in one CTHyperlink linking to https://example.com.
This leads to problems when the link text needs to be changed along with the link target. The text of all 6 text runs together is the link text. So which text run should be changed?
The best I have found is a method which sets the text of the first text run in the CTHyperlink:
/**
 * Sets the text of the first text run in the CTHyperlink of this XWPFHyperlinkRun.
 * Tries solving the problem when a CTHyperlink contains multiple text runs.
 * Then the String value is set in first text run only. All other text runs are set empty.
 */
/*modifiers*/ void setTextInFirstRun(XWPFHyperlinkRun hyperlinkRun, String value) {
    org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHyperlink ctHyperlink = hyperlinkRun.getCTHyperlink();
    for (int r = 0; r < ctHyperlink.getRList().size(); r++) {
        org.openxmlformats.schemas.wordprocessingml.x2006.main.CTR ctR = ctHyperlink.getRList().get(r);
        for (int t = 0; t < ctR.getTList().size(); t++) {
            org.openxmlformats.schemas.wordprocessingml.x2006.main.CTText ctText = ctR.getTList().get(t);
            if (r == 0 && t == 0) {
                ctText.setStringValue(value);
            } else {
                ctText.setStringValue("");
            }
        }
    }
}
There the String value is set in the first text run only. All other text runs are set to empty. The text formatting of the first text run remains.
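The effect of this approach on the underlying XML can be sketched without POI, using only the JDK's DOM API. The hyperlink fragment below is assumed sample data with three runs instead of six, and the class name FirstRunTextDemo is hypothetical. The method puts the new text into the first w:t element and empties all following ones, then returns the text of each w:t so the result can be inspected.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FirstRunTextDemo {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumed sample data: one hyperlink wrapping three text runs.
    static final String HYPERLINK_XML =
        "<w:hyperlink xmlns:w='" + W_NS + "'>"
      + "<w:r><w:t>this is </w:t></w:r>"
      + "<w:r><w:t>a link to </w:t></w:r>"
      + "<w:r><w:t>example.com</w:t></w:r>"
      + "</w:hyperlink>";

    /**
     * Sets the new text in the first w:t element and empties all others.
     * Returns the text of every w:t element after the change, in document order.
     */
    static List<String> setTextInFirstRun(String hyperlinkXml, String value) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(hyperlinkXml)));

        NodeList texts = doc.getElementsByTagNameNS(W_NS, "t");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < texts.getLength(); i++) {
            // Only the first text element keeps visible content.
            texts.item(i).setTextContent(i == 0 ? value : "");
            result.add(texts.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(setTextInFirstRun(HYPERLINK_XML, "this is a link to yahoo.com"));
    }
}
```

The run elements themselves are left in place, so any run properties on the first run still apply to the new text; the emptied runs simply contribute nothing to the visible link text.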
Upvotes: 2
Reputation: 1112
I found a way to edit the URL of the link in an indirect way (copy the previous hyperlink, modify the URL, delete the previous hyperlink and add the new one to the paragraph).
The code is shown below:
// imports needed:
// import org.apache.poi.xwpf.usermodel.*;
// import org.apache.xmlbeans.XmlObject;

private void editLinksOfParagraph(XWPFParagraph paragraph, XWPFDocument document) {
    for (int rIndex = 0; rIndex < paragraph.getRuns().size(); rIndex++) {
        XWPFRun run = paragraph.getRuns().get(rIndex);
        if (run instanceof XWPFHyperlinkRun) {
            // get the URL of the link to edit it
            XWPFHyperlink link = ((XWPFHyperlinkRun) run).getHyperlink(document);
            String linkURL = link.getURL();
            // copy the XML representation of the run, which includes text and formatting
            XmlObject xmlObject = run.getCTR().copy();
            linkURL += "-edited-link"; // edited URL of the link, e.g. add a '-edited-link' suffix
            // remove the previous link from the paragraph
            paragraph.removeRun(rIndex);
            // add the new hyperlink run with the updated URL in place of the deleted one
            XWPFHyperlinkRun hyperlinkRun = paragraph.insertNewHyperlinkRun(rIndex, linkURL);
            hyperlinkRun.getCTR().set(xmlObject);
        }
    }
}
Upvotes: 2
Reputation: 661
This works, but a few more steps are needed to preserve the text formatting:
// imports needed:
// import java.io.FileInputStream;
// import java.io.FileOutputStream;
// import org.apache.poi.xwpf.usermodel.*;

try (var fis = new FileInputStream(fileName);
     var doc = new XWPFDocument(fis)) {
    var pList = doc.getParagraphs();
    for (var p : pList) {
        var runs = p.getRuns();
        for (int i = 0; i < runs.size(); i++) {
            var r = runs.get(i);
            if (r instanceof XWPFHyperlinkRun) {
                var run = (XWPFHyperlinkRun) r;
                var link = run.getHyperlink(doc);
                // print "text: link" for checking
                System.out.println(run.getText(0) + ": " + link.getURL());
                // replace it: insert a new hyperlink run with the new URL at the same position
                var run1 = p.insertNewHyperlinkRun(i, "http://google.com");
                run1.setText(run.getText(0));
                // remove the old link, which has moved to position i + 1
                p.removeRun(i + 1);
            }
        }
    }
    try (var fos = new FileOutputStream(outFileName)) {
        doc.write(fos);
    }
}
I'm using these libraries:
implementation 'org.apache.poi:poi:5.2.2'
implementation 'org.apache.poi:poi-ooxml:5.2.2'
Upvotes: 0