Andre

Reputation: 53

How to transform an HTML ul tree into flat (1D) XML using XSLT 1.0?

I want to transform a nested chat tree into a flat list. This is my first time working with XML. I have already written two other transformations, but I can't work out how to produce XML from HTML. How is this done?

The input is an HTML document containing a nested chat tree of names and messages:

<!DOCTYPE html>
<html>
<head>
    <title>Chat</title>
</head>
<body>
    <ul>
        <li>
            <b>Name 1</b> say: Hi!<ul>
                <li>
                    <b>Name 2</b> say: Hello<ul>
                        <li>
                            <b>Name 3</b> say: Ho ho
                        </li>
                        <li>
                            <b>Name 4</b> say: How do you do?<ul>
                                <li>
                                    <b>Name 5</b> say: I'm fine<ul>
                                        <li>
                                            <b>Name 4</b> say: Ok
                                        </li>
                                    </ul>
                                </li>
                            </ul>
                        </li>
                    </ul>
                </li>
                <li>
                    <b>Name 3</b> say: Hi. How do you do?<ul>
                        <li>
                            <b>Name 1</b> say: Fine
                        </li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>
</body>
</html>

Expected XML output (pid is the id of the parent item, or 0 for a top-level item):

<?xml version="1.0" encoding="UTF-8"?>
<items>
    <item id="1" pid="0" name="Name 1">Hi!</item>
    <item id="2" pid="1" name="Name 2">Hello</item>
    <item id="3" pid="2" name="Name 3">Ho ho</item>
    <item id="4" pid="2" name="Name 4">How do you do?</item>
    <item id="5" pid="4" name="Name 5">I'm fine</item>
    <item id="6" pid="5" name="Name 4">Ok</item>
    <item id="7" pid="1" name="Name 3">Hi. How do you do?</item>
    <item id="8" pid="7" name="Name 1">Fine</item>
</items>

Maybe something like this? It doesn't work on the full HTML, though.

<xsl:stylesheet version="1.0" encoding="utf-8"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="b">
        <item>
            <xsl:attribute name="name">
                <xsl:value-of select="@*|node()"/>
            </xsl:attribute>
        </item>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 0

Views: 57

Answers (1)

michael.hor257k

Reputation: 117100

Try this as your starting point:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
    <items>
        <xsl:for-each select="//li">
            <item id="{generate-id()}" pid="{generate-id(ancestor::li[1])}" name="{b[1]}">
                <xsl:value-of select="normalize-space(substring-after(text()[1], 'say: '))" />
            </item>
        </xsl:for-each>
    </items>
</xsl:template>

</xsl:stylesheet>

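Note that the format of the id and pid values is processor-dependent: generate-id() returns an opaque string that is unique per node, not a sequential number. For a top-level li there is no ancestor li, so generate-id() receives an empty node-set and pid comes out as an empty string rather than 0. Also note that xsl:strip-space is needed here: without it, text()[1] would be the whitespace before the b element and the message text would come out empty.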
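If you need sequential numbers like those in your expected output, one option is to compute each li's position in document order by counting its ancestor and preceding li elements. The following is a minimal sketch, not a definitive solution. It assumes the input parses as well-formed XML (as your sample does) and that the elements are in no namespace; an XHTML document with a default namespace would need prefixed element names in the XPath expressions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<!-- Drop whitespace-only text nodes so that text()[1] is the message text after <b> -->
<xsl:strip-space elements="*"/>

<xsl:template match="/">
    <items>
        <xsl:for-each select="//li">
            <!-- id: number of li elements before this one in document order, plus 1 -->
            <!-- pid: the same count for the nearest ancestor li; every term is 0 when there is no such ancestor -->
            <item id="{count(ancestor::li | preceding::li) + 1}"
                  pid="{count(ancestor::li[1]/ancestor::li | ancestor::li[1]/preceding::li) + count(ancestor::li[1])}"
                  name="{b[1]}">
                <xsl:value-of select="normalize-space(substring-after(text()[1], 'say: '))"/>
            </item>
        </xsl:for-each>
    </items>
</xsl:template>

</xsl:stylesheet>
```

The pid expression avoids an xsl:choose: count(ancestor::li[1]) is 1 when a parent li exists and 0 otherwise, so top-level items get pid="0". Counting along the preceding axis for every node can be slow on very large documents, so treat this as a starting point.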

Upvotes: 2
