Canoe

Reputation: 347

git compiling: Documentation/git-add.xml does not validate

When compiling Git I get these errors:

make[2]: Leaving directory `/home/xxx/git-master'
    XMLTO git-add.1
xmlto: /home/xxx/git-master/Documentation/git-add.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd
/home/xxx/git-master/Documentation/git-add.xml:2: warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
D DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd
warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
validity error : Could not load the external subset "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
Document /home/xxx/git-master/Documentation/git-add.xml does not validate
make[1]: *** [git-add.1] Error 13
make[1]: Leaving directory `/home/xxx/git-master/Documentation'
make: *** [doc] Error 2

What is the main problem?

Upvotes: 7

Views: 2851

Answers (5)

VonC

Reputation: 1324827

Note: you might run into this error again in more recent builds (2019, seven years later), because Git is starting to use DocBook 5 (instead of DocBook 4.5), as Asciidoctor 2.0 no longer works with the older version.
Git 2.24 (Q4 2019) clarifies the situation.

See commit f6461b8 (15 Sep 2019) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit faf5576, 06 Oct 2019)

Documentation: fix build with Asciidoctor 2

Our documentation toolchain has traditionally been built around DocBook 4.5.
This version of DocBook is the last DTD-based version of DocBook.
In 2009, DocBook 5 was introduced using namespaces and its syntax is expressed in RELAX NG, which is more expressive and allows a wider variety of syntax forms.

Asciidoctor, one of the alternatives for building our documentation, moved support for DocBook 4.5 out of core in its recent 2.0 release and now only supports DocBook 5 in the main release.
The DocBook 4.5 converter is still available as a separate component, but this is not available in most distro packages.
This would not be a problem but for the fact that we use xmlto, which is still stuck in the DocBook 4.5 era.

xmlto performs DTD validation as part of the build process.
This is not problematic for DocBook 4.5, which has a valid DTD, but it clearly cannot work for DocBook 5, since no DTD can adequately express its full syntax.
In addition, even if xmlto did support RELAX NG validation, that wouldn't be sufficient because it uses the libxml2-based xmllint to do so, which has known problems with validating interleaves in RELAX NG.

Fortunately, there's an easy way forward: ask Asciidoctor to use its DocBook 5 backend and tell xmlto to skip validation.
Asciidoctor has supported DocBook 5 since v0.1.4 in 2013 and xmlto has supported skipping validation for probably longer than that.
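As a rough illustration of that approach (not the commands Git's Makefile actually runs), the two steps could look like the sketch below. The file names `git-add.txt` and `git-add.xml` are assumptions for the example; `-b docbook5` selects Asciidoctor's DocBook 5 backend and `--skip-validation` is the xmlto option named in the error message in the question.

```shell
#!/bin/sh
# Sketch: build one man page via DocBook 5, skipping xmlto's DTD validation.
# File names are illustrative only.
src=git-add.txt
xml=git-add.xml

# Command lines are kept in variables so they can be inspected before running.
asciidoctor_cmd="asciidoctor -b docbook5 -d manpage -o $xml $src"
xmlto_cmd="xmlto --skip-validation man $xml"

if command -v asciidoctor >/dev/null 2>&1 &&
   command -v xmlto >/dev/null 2>&1 &&
   [ -f "$src" ]; then
    # Unquoted on purpose: the strings are split into command and arguments.
    $asciidoctor_cmd && $xmlto_cmd
else
    # Tools or source file missing: only show what would be run.
    echo "$asciidoctor_cmd"
    echo "$xmlto_cmd"
fi
```

When building Git itself, the Makefile handles these steps; running `make USE_ASCIIDOCTOR=YesPlease doc` is the usual way to select the Asciidoctor toolchain.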

We also need to teach xmlto how to use the namespaced DocBook XSLT stylesheets instead of the non-namespaced ones it usually uses.
Normally these stylesheets are interchangeable, but the non-namespaced ones have a bug that causes them not to strip whitespace automatically from certain elements when namespaces are in use.
This results in additional whitespace at the beginning of list elements, which is jarring and unsightly.

We can do this by passing a custom stylesheet with the -x option that simply imports the namespaced stylesheets via a URL.
Any system with support for XML catalogs will automatically look this URL up and reference a local copy instead without us having to know where this local copy is located. We know that anyone using xmlto will already have catalogs set up properly since the DocBook 4.5 DTD used during validation is also looked up via catalogs.
All major Linux distributions distribute the necessary stylesheets and have built-in catalog support, and Homebrew does as well, albeit with a requirement to set an environment variable to enable catalog support.
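A minimal sketch of such a wrapper stylesheet follows. The temporary file name and the `git-add.xml` input are assumptions for the example, and the import URL is the conventional location of the namespaced DocBook man-page stylesheets; the file Git ships may differ in detail.

```shell
#!/bin/sh
# Sketch: a wrapper stylesheet that imports the namespaced DocBook man-page XSLT.
xsl=$(mktemp "${TMPDIR:-/tmp}/manpage-ns.XXXXXX")

cat >"$xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Resolved through the XML catalog to a local copy when one is installed. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl"/>
</xsl:stylesheet>
EOF

# -x tells xmlto to use this stylesheet instead of its default one.
xmlto_cmd="xmlto -x $xsl --skip-validation man git-add.xml"
echo "$xmlto_cmd"
```

Because the stylesheet is referenced by URL rather than by path, the same wrapper works wherever the catalog maps that URL to a local copy.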

On the off chance that someone lacks support for catalogs, it is possible for xmlto (via xmllint) to download the stylesheets from the URLs in question, although this will likely perform poorly enough to attract attention.
People still have the option of using the prebuilt documentation that we ship, so happily this should not be an impediment.

Finally, we need to filter out some messages from other stylesheets that occur when invoking dblatex in the CI job.
This tool strips namespaces much like the unnamespaced DocBook stylesheets and prints similar messages.
If we permit these messages to be printed to standard error, our documentation CI job will fail because we check standard error for unexpected output.
Due to dblatex's reliance on Python 2, we may need to revisit its use in the future, in which case this problem may go away, but this can be delayed until a future patch.

The final message we filter is due to libxslt on modern Debian and Ubuntu.
The patch which they use to implement reproducible ID generation also prints messages about the ID generation.
While this doesn't affect our current CI images since they use Ubuntu 16.04 which lacks this patch, if we upgrade to Ubuntu 18.04 or a modern Debian, these messages will appear and, like the above messages, cause a CI failure.
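The filtering idea described above can be sketched as follows. The two `sed` patterns and the sample lines are placeholders chosen for illustration, not necessarily the exact messages the tools print or the filter the CI script uses.

```shell
#!/bin/sh
# Sketch: drop known-harmless lines from captured stderr and keep everything else.
# The patterns below are placeholders, not confirmed tool output.
filter_log() {
    sed -e '/stripped namespace before processing/d' \
        -e '/IDs for element/d'
}

# Example input: two expected noise lines and one genuine warning.
sample='Note: stripped namespace before processing
Attributed IDs for element refentry
warning: something unexpected'

unexpected=$(printf '%s\n' "$sample" | filter_log)

# A CI job would fail only if anything survives the filter.
if [ -n "$unexpected" ]; then
    printf 'unexpected stderr output:\n%s\n' "$unexpected"
fi
```

Filtering by pattern keeps the check strict: any message that is not explicitly listed still reaches the log and fails the job.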

Upvotes: 2

jonseymour

Reputation: 1086

On OSX (Mountain Lion) I had to do this:

brew install asciidoc
brew install xmlto
brew install docbook   

# then (as prompted by brew...)
#
# If you intend to process AsciiDoc files through an XML stage
# (such as a2x for manpage generation) you need to add something
# like:
#
export XML_CATALOG_FILES=/usr/local/etc/xml/catalog
#
# to your shell rc file so that xmllint can find AsciiDoc's
# catalog files.

brew install docbook-xsl

(Thanks to Nathan for providing the necessary hints.)

Upvotes: 7

Ricardo Mendes

Reputation: 341

jonseymour's answer helped me on Mac OS X El Capitan.

To export XML_CATALOG_FILES=/usr/local/etc/xml/catalog, I did this:

vim ~/.bash_profile

(on an empty line, insert)
export XML_CATALOG_FILES=/usr/local/etc/xml/catalog

Save and exit, then reload the profile:

. ~/.bash_profile

That solved it.
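If you would rather not edit the file by hand, the same change can be scripted. The sketch below is one possible way, with a hypothetical helper named `add_line_once`; it demonstrates the helper on a scratch file so that running it does not touch your real profile.

```shell
#!/bin/sh
# Sketch: add the export line to a file only if it is not already there.
line='export XML_CATALOG_FILES=/usr/local/etc/xml/catalog'

add_line_once() {
    # $1 = file, $2 = exact line that must be present
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >>"$1"
}

# Demonstration on a scratch file; the second call is a no-op.
demo=$(mktemp)
add_line_once "$demo" "$line"
add_line_once "$demo" "$line"
count=$(grep -cxF -- "$line" "$demo")
echo "occurrences: $count"   # prints "occurrences: 1"
```

To apply it for real, call `add_line_once ~/.bash_profile "$line"` and then reload the profile with `. ~/.bash_profile`.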

Upvotes: 2

Corey

Reputation: 579

Late to the party, but on Cygwin the package you need for this to validate is docbook-xml45 (as implied by the DTD URI, docbook/xml/4.5/docbookx.dtd).

Upvotes: 7

Nathan Zook

Reputation: 21

This appears to be something of a recurring issue for Git. While hunting down the solution (today), I ran across it in several forums (Linux, Cygwin, Mac OS). The problem is always the same: lack of a good DocBook catalog. Unfortunately, installing the appropriate catalog is HIGHLY dependent on your installation, and there is more than one way to lack a good catalog.

  1. A bad catalog was released a few years ago. Uninstall & install the update.
  2. The package that built the catalog failed part way. Remove & reinstall the package.
  3. The package that installs the catalog has not been installed, and the package tools haven't taken care of you.

Option 3 is where I was. I have brew installed, so these commands took care of the problem for me:

sudo brew install docbook

sudo docbook-register
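To tell which of these situations you are in, it can help to ask the catalog directly whether it resolves the DTD from the error message. The sketch below uses libxml2's `xmlcatalog` tool; the fallback path `/etc/xml/catalog` and the helper name `check_catalog` are assumptions, and it expects `XML_CATALOG_FILES`, if set, to name a single file.

```shell
#!/bin/sh
# Sketch: ask the XML catalog whether it can resolve the DocBook 4.5 DTD locally.
dtd='http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'
# Assumed default location; XML_CATALOG_FILES overrides it when set.
catalog="${XML_CATALOG_FILES:-/etc/xml/catalog}"

check_catalog() {
    # $1 = catalog file, $2 = system identifier to resolve
    if ! command -v xmlcatalog >/dev/null 2>&1; then
        echo "xmlcatalog not installed"
        return 2
    fi
    if [ ! -f "$1" ]; then
        echo "no catalog at $1"
        return 1
    fi
    # Prints the resolved local location when the catalog has an entry.
    xmlcatalog "$1" "$2"
}

check_catalog "$catalog" "$dtd" || echo "DTD not resolvable through the catalog"
```

If this prints a local file path, the catalog is fine and the problem lies elsewhere. If it does not, installing or re-registering the DocBook package is the likely fix.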

Alternatively, there is a separate download of just the docs available.

Upvotes: 2
