Joshua Muheim

Pandoc 2.x renders images' alternative texts in an inaccessible fashion

Since I upgraded from Pandoc v1.19 to 2.9, decorative images are not exported as expected anymore.

First of all, when generating HTML from ![](test.jpg), in v1.19 a <p class="figure"> structure was wrapped around the image, but now it's only a <p>:

<p>
  <img src="test.jpg">
</p>

This makes it harder to style in line with other images that have an alternative text.

But the real problem is that no alt="" attribute is produced anymore! This means that screen readers, for example, will no longer recognise this as a decorative image.

So let's see what happens to an image with an actual alternative text, e.g. when generating HTML from ![Hello](test.jpg):

<div class="figure">
  <img src="test.jpg" alt="">
  <p class="caption">Hello</p>
</div>

Here we get a class="figure" on the surrounding element, but now it's a <div> instead of a <p> (I'm not too bothered by this, but again, it makes it harder to style everything the same way).

What is a big problem again, though, is that the alt attribute is now empty: this prevents screen readers from perceiving the image at all, which is horribly wrong! I guess that Pandoc concludes that having both an alternative text and a caption would be redundant (which is correct), and that the caption below is the one to keep (which it is not).

The right structure would look something like this:

<div class="figure">
  <img src="test.jpg" alt="Hello"><!-- Leave the alternative text on the image -->
  <p class="caption" aria-hidden="true">Hello</p><!-- Hide the redundant visual alternative text from screen readers -->
</div>

Any reason why this behaviour would make sense? Can it be changed somehow? Otherwise I will have to fiddle around with some post-processing JavaScript...

Answers (1)

tarleb

The ![](test.jpg) example is no longer treated as a figure, because pandoc now requires that

  1. the image is the only element in a paragraph, and
  2. it has a caption.

Wrapping of figures with <div> happens when exporting to HTML4. Using the latest pandoc 2.9.2.1 and running pandoc -t html5 on the input ![Hello](test.jpg) yields:

<figure>
<img src="test.jpg" alt="" /><figcaption>Hello</figcaption>
</figure>

The rationale for emitting an empty alt attribute is that screen readers would otherwise read the caption twice: first the alt, then the figcaption. Your suggestion seems much better; please open an issue.

If you can't wait for a new release, you can use a Lua filter (saved to a file and passed to pandoc with --lua-filter) to create figures the way you like:

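-- Turn a paragraph that contains only an image into a custom HTML figure.
function Para (p)
  if #p.content == 1 and p.content[1].t == "Image" then
    local image = p.content[1]
    -- The Markdown reader marks implicit figures with a "fig:" title prefix;
    -- strip it so it does not end up in the title attribute.
    image.title = image.title:gsub('^fig:', '')
    local figure_content = pandoc.List{}
    -- The image is rendered inline, so it keeps its alt text.
    figure_content:insert(image)
    -- Visible caption, hidden from screen readers to avoid reading it twice.
    figure_content:insert(
      pandoc.RawInline('html', '\n<p class="caption" aria-hidden="true">'))
    figure_content:extend(image.caption)
    figure_content:insert(pandoc.RawInline('html', '</p>'))
    local attr = pandoc.Attr("", {"figure"})
    return pandoc.Div({pandoc.Plain(figure_content)}, attr)
  end
end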
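
If post-processing the generated HTML is preferred over a filter, the same restructuring can be sketched in Python. This is only a rough sketch and not part of pandoc: restore_alt and FIGURE_RE are hypothetical names, and the regular expression assumes the exact markup shown above (an <img> with alt="" directly followed by <p class="caption"> or <figcaption>). Anything that deviates from that shape is left untouched.

```python
import html
import re

# Assumed shape: <img ... alt="" ...> directly followed by the caption element.
FIGURE_RE = re.compile(
    r'(<img\b[^>]*?)\salt=""([^>]*>)'
    r'(\s*<(?:p class="caption"|figcaption))>'
    r'(.*?)(</(?:p|figcaption)>)',
    re.DOTALL,
)


def restore_alt(markup):
    """Copy each caption into the empty alt and hide the caption from screen readers."""

    def fix(match):
        img_start, img_end, cap_open, cap_html, cap_close = match.groups()
        # Drop inline tags such as <em>, then re-escape the text for an attribute.
        text = re.sub(r"<[^>]+>", "", cap_html)
        alt = html.escape(html.unescape(text), quote=True)
        return (
            f'{img_start} alt="{alt}"{img_end}'
            f'{cap_open} aria-hidden="true">{cap_html}{cap_close}'
        )

    return FIGURE_RE.sub(fix, markup)


if __name__ == "__main__":
    print(restore_alt('<img src="test.jpg" alt="" /><figcaption>Hello</figcaption>'))
    # prints: <img src="test.jpg" alt="Hello" /><figcaption aria-hidden="true">Hello</figcaption>
```

A regular expression is fragile compared to a real HTML parser, so this only suits output whose shape is known in advance; the Lua filter remains the more robust route.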
