Tamer Rifai

Reputation: 177

iText 7 accessible PDFs from HTML: how do I avoid table tags when using display: table?

I am working on an ASP.NET MVC project that converts a view to a PDF. Previously, Rotativa was used, but a new client requirement is that the PDF be accessible (Section 508 compliant). For layout purposes, the previous developer had built a whole header section (logo, title, disclaimer, etc.) as a table without th elements (just td). I needed to convert it to divs but keep the look the same. So I changed the elements to divs and applied the CSS properties display: table, display: table-row-group, display: table-row, and display: table-cell where appropriate. The result looks almost exactly the same.

The problem appears when I run the conversion with iText's DefaultTagWorkerFactory, like this:

ConverterProperties props = new ConverterProperties();
FontProvider fp = new FontProvider();
fp.AddStandardPdfFonts();
props.SetFontProvider(fp);
var tagWorkerFactory = new DefaultTagWorkerFactory();
props.SetTagWorkerFactory(tagWorkerFactory);
HtmlConverter.ConvertToPdf(html, pdfDoc, props);

Even though the elements are now divs, iText still tags them as table, tr, and td. The whole point of using display: table on a div is to get the table layout without the table semantics.

Why does iText behave this way, and is there any way around it? If not, can anyone suggest CSS that reproduces the layout of display: table, display: table-row-group, display: table-row, and display: table-cell without using those properties? It appears that iText sees the property display: table and applies the table tag. I also tried a custom tag worker factory that inherits from DefaultTagWorkerFactory and returns a DivTagWorker for any div that carries the class "make-div", like this:

public class AccessibilityTagWorkerFactory : DefaultTagWorkerFactory
{
    public override ITagWorker GetCustomTagWorker(IElementNode tag, ProcessorContext context)
    {
        var attributes = tag.GetAttributes();
        var cssClass = attributes.GetAttribute("class");
        if (!string.IsNullOrWhiteSpace(cssClass) && cssClass.Contains("make-div"))
        {
            return new DivTagWorker(tag, context);
        }
        return null;
    }
}

However, it throws an exception like "Cannot implicitly convert DivTagWorker to DisplayTableRowTagWorker".

I'm at my wits' end with all of this. Any help would be appreciated. Thank you.

Upvotes: 4

Views: 1476

Answers (2)

Tamer Rifai

Reputation: 177

@AlexeySubach posted a very helpful answer. It did not work 100% for me, but it pointed me in the right direction. His answer is also in Java, and I was working with the .NET version of iText. Here is what I ended up doing: I left the HTML markup as tables. Since some of my tables were real tables and others were only there for layout, I added the class "make-table-div" to the table elements that should not be tagged as tables. Then I created this class:

public class DivRoleTableTagWorker : TableTagWorker
{
    public DivRoleTableTagWorker(IElementNode element, ProcessorContext context) : base(element, context)
    {
    }

    public override void ProcessEnd(IElementNode element, ProcessorContext context)
    {
        // Let the base worker build the table so the layout is unchanged.
        base.ProcessEnd(element, context);
        if (GetElementResult() is Table table)
        {
            // Tag the table itself as a Div instead of a Table.
            table.GetAccessibilityProperties().SetRole(StandardRoles.DIV);
            for (int i = 0; i < table.GetNumberOfRows(); i++)
            {
                for (int j = 0; j < table.GetNumberOfColumns(); j++)
                {
                    // GetCell can return null for positions without a cell.
                    Cell cell = table.GetCell(i, j);
                    if (cell != null)
                    {
                        cell.GetAccessibilityProperties().SetRole(StandardRoles.DIV);
                    }
                }
            }
        }
    }
}

And

public class AccessibilityTagWorkerFactory : DefaultTagWorkerFactory
{
    public override ITagWorker GetCustomTagWorker(IElementNode tag, ProcessorContext context)
    {
        // Null when the element has no class attribute.
        string cssClass = tag.GetAttribute(AttributeConstants.CLASS);
        if (cssClass != null)
        {
            // HRoleSpanTagWorker is another custom tag worker of mine (not shown here).
            if (cssClass.Contains("make-h1"))
            {
                return new HRoleSpanTagWorker(tag, context, StandardRoles.H1);
            }
            if (cssClass.Contains("make-h2"))
            {
                return new HRoleSpanTagWorker(tag, context, StandardRoles.H2);
            }
            if (cssClass.Contains("make-table-div"))
            {
                return new DivRoleTableTagWorker(tag, context);
            }
        }
        return base.GetCustomTagWorker(tag, context);
    }
}

And, finally, to use it:

        ConverterProperties props = new ConverterProperties();
        FontProvider fp = new FontProvider();
        fp.AddStandardPdfFonts();
        props.SetFontProvider(fp);
        DefaultTagWorkerFactory tagWorkerFactory = new AccessibilityTagWorkerFactory();
        props.SetTagWorkerFactory(tagWorkerFactory);
        HtmlConverter.ConvertToPdf(html, pdfDoc, props);
        pdfDoc.Close();

In this way, I was able to keep the PDF looking exactly the same and still have the correct tags. Thank you so much for your help.

Upvotes: 0

Alexey Subach

Reputation: 12312

You seem to have substituted the tag worker for the display: table-row element but not for the display: table element. I cannot tell for sure, though, because you haven't shared a sample HTML file that reproduces the issue.

In any case, substituting tag workers is not the best approach here. It would drop the layout adjustments that the custom display CSS property produces, whereas your goal is to keep the layout and change only the tagging.

To keep the layout and change the tagging, do not change which tag worker is created. Instead, set the appropriate roles on the Table and Cell layout elements that those tag workers produce.

The corresponding overridden tag worker might look like the following:

private static class DivRoleDisplayTableTagWorker extends DisplayTableTagWorker {
    public DivRoleDisplayTableTagWorker(IElementNode element, ProcessorContext context) {
        super(element, context);
    }

    @Override
    public void processEnd(IElementNode element, ProcessorContext context) {
        super.processEnd(element, context);
        if (getElementResult() instanceof Table) {
            Table table = (Table) getElementResult();
            table.getAccessibilityProperties().setRole(StandardRoles.DIV);
            for (int i = 0; i < table.getNumberOfRows(); i++) {
                for (int j = 0; j < table.getNumberOfColumns(); j++) {
                    Cell cell = table.getCell(i, j);
                    if (cell != null) {
                        cell.getAccessibilityProperties().setRole(StandardRoles.DIV);
                    }
                }
            }
        }
    }
}

All you need to do is replace the tag worker for any div that has display: table with your custom tag worker. Depending on your document, you might use different conditions, but for a simple case the custom tag worker factory would look like this:

ITagWorkerFactory customFactory = new DefaultTagWorkerFactory() {
    @Override
    public ITagWorker getCustomTagWorker(IElementNode tag, ProcessorContext context) {
        if (CssConstants.TABLE.equals(tag.getStyles().get(CssConstants.DISPLAY)) && TagConstants.DIV.equals(tag.name())) {
            return new DivRoleDisplayTableTagWorker(tag, context);
        }
        return super.getCustomTagWorker(tag, context);
    }
};

Now your table and its cells would be tagged as divs.
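
As one example of a "different condition", the factory could key off a marker class instead of the computed display style. The following is only a minimal sketch in plain Java with no iText dependency: the helper name hasCssClass and the class CssClassMatcher are hypothetical, and "make-table-div" is simply the marker class used elsewhere in this thread. It compares whole whitespace-separated tokens, so a substring test that would also accept "make-table-divider" is avoided.

```java
import java.util.Arrays;

// Hypothetical helper, not part of iText: matches a CSS class name as a
// whole token inside the raw value of an HTML class attribute.
final class CssClassMatcher {
    private CssClassMatcher() {
    }

    static boolean hasCssClass(String classAttribute, String className) {
        // A missing attribute or an empty class name never matches.
        if (classAttribute == null || className == null || className.isEmpty()) {
            return false;
        }
        String trimmed = classAttribute.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Class names are separated by runs of whitespace.
        return Arrays.asList(trimmed.split("\\s+")).contains(className);
    }
}
```

Inside getCustomTagWorker, a condition such as CssClassMatcher.hasCssClass(tag.getAttribute(AttributeConstants.CLASS), "make-table-div") could then stand in for the display check, assuming your markup carries that marker class.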

Upvotes: 4
