Reputation: 2791
I need to create a PDF/UA compliant document in iText 7. The most important requirement is that all content is tagged. When tagging is enabled (by calling the PdfDocument.SetTagged() method), most elements added to the document get correct tags.
The issue is with the tagging of table header cells. According to ISO 32000-1:2008, table header cells must be tagged as TH and table data cells must be tagged as TD (14.8.4.3.4 Table Elements, Table 337).
iText lets you distinguish between header cells and regular cells through the Table.AddHeaderCell() and Table.AddCell() methods. This mechanism correctly creates THead and TBody tags for the groups of rows. Unfortunately, the cells themselves are always tagged as TD.
Here is sample code for generating a table:
//var pdfDoc = new PdfDocument(...)
pdfDoc.SetTagged();
var doc = new Document(pdfDoc);
var table = new Table(2);
table.AddHeaderCell("Header 0");
table.AddHeaderCell("Header 1");
table.AddCell("Data 0");
table.AddCell("Data 1");
doc.Add(table);
doc.Close();
Here is an example of the tag structure we are getting:
<Table>
  <THead>
    <TR>
      <TD> //must be TH!
        <P>
          "Header 0"
      <TD>
        <P>
          "Header 1"
  <TBody>
    <TR>
      <TD> //TD is correct here
        <P>
          "Data 0"
      <TD>
        <P>
          "Data 1"
Is it possible to make iText generate TH tags when the AddHeaderCell() method is used?
I am using iText 7.0.0 for .NET (Community edition)
Upvotes: 0
Views: 1330
Reputation: 31
Please note that this changed in iText 7.1. You can no longer call setRole() directly on the element; you have to go through its accessibility properties. Furthermore, setRole() on the accessibility properties only accepts a string. Be aware that PdfName.TH.toString() returns the name with a leading slash ("/TH"), so pass the plain role name instead, for example through the StandardRoles constants. So now, it would be:
cell.getAccessibilityProperties().setRole(StandardRoles.TH);
Upvotes: 1
Reputation: 1719
EDIT: The initial answer was mistakenly given in the context of pdfHTML rather than iText 7 proper.
Header cells end up tagged as TD as a side effect of the current implementation, which treats a TH in the same way as a TD.
For iText7
Set the role of each header cell to TH before adding it to the table:
cell.setRole(PdfName.TH);
For pdfHTML
While it's possible to access the elements after conversion and before adding them to the document, you would need to traverse the tree of iText elements to find the tables and identify their header cells. It's easier to override the conversion behavior for the tag with a custom tag worker. The following code is taken from the accessibility example. For a primer on custom tag workers, have a look at the configuration blog post.
Start by creating a custom tag worker that extends TdTagWorker but overrides the role right before returning the element result:
public class TableHeaderTagWorker extends TdTagWorker {
    public TableHeaderTagWorker(IElementNode element, ProcessorContext context) {
        super(element, context);
    }

    @Override
    public IPropertyContainer getElementResult() {
        // Reuse the cell built by TdTagWorker and only change its role to TH
        Cell cell = (Cell) super.getElementResult();
        cell.setRole(PdfName.TH);
        return cell;
    }
}
Create a custom tag worker factory that maps this tag worker to the th tag:
public class AccessibilityTagWorkerFactory extends DefaultTagWorkerFactory {
    @Override
    public ITagWorker getCustomTagWorker(IElementNode tag, ProcessorContext context) {
        // This could probably be replaced with a regex or string pattern
        if (tag.name().equals("h1")) {
            return new HeaderTagWorker(tag, context, 1);
        }
        if (tag.name().equals("h2")) {
            return new HeaderTagWorker(tag, context, 2);
        }
        if (tag.name().equals("h3")) {
            return new HeaderTagWorker(tag, context, 3);
        }
        if (tag.name().equals("h4")) {
            return new HeaderTagWorker(tag, context, 4);
        }
        if (tag.name().equals("h5")) {
            return new HeaderTagWorker(tag, context, 5);
        }
        if (tag.name().equals("h6")) {
            return new HeaderTagWorker(tag, context, 6);
        }
        if (tag.name().equals("th")) {
            return new TableHeaderTagWorker(tag, context);
        }
        // Returning null falls back to the default tag worker for this tag
        return null;
    }
}
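The comment in the factory hints that the h1 to h6 chain could be collapsed with a pattern. The following is only a sketch of the name-matching part, kept independent of pdfHTML so it can be run on its own. The class name HeadingTagMatcher and the method headingLevel are hypothetical and not part of iText or the accessibility example.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText or pdfHTML.
public class HeadingTagMatcher {
    // Matches exactly "h1" through "h6", like the equals() checks in the factory.
    private static final Pattern HEADING = Pattern.compile("h([1-6])");

    /** Returns the heading level for h1..h6, or an empty value for any other tag name. */
    public static OptionalInt headingLevel(String tagName) {
        Matcher m = HEADING.matcher(tagName);
        if (m.matches()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(headingLevel("h3"));
        System.out.println(headingLevel("th"));
    }
}
```

Inside getCustomTagWorker, a present level could drive a single new HeaderTagWorker(tag, context, level) call, with the th check left as it is.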
Finally, set the ConverterProperties to use this custom factory:
ConverterProperties props = new ConverterProperties();
DefaultTagWorkerFactory tagWorkerFactory = new AccessibilityTagWorkerFactory();
props.setTagWorkerFactory(tagWorkerFactory);
HtmlConverter.convertToPdf(new FileInputStream(src), pdfDoc, props);
pdfDoc.close();
Upvotes: 5