Sean

Reputation: 517

Retrieve all tag names from CheerioJS DOM object

Given the following HTML:

<html>
    <head>
        <title>This is text within the title tag</title>
    </head>
    <body>
        This is text in the body tag
        <br>
        <h1>This is text in the h1 tag</h1>
        <p>This is text in the p tag</p>
        There is more text in the body after the p tag
    </body>
</html>

I'm looking to use CheerioJS, an HTML parser, to collect each HTML tag into an array for manipulation purposes.

The desired output would be an array of the following:

[html, head, title, /title, /head, body, br, h1, /h1, p, /p, /body, /html]

I've been looking at Cheerio's DOM object but I'm not sure if it's what I need.

Upvotes: 1

Views: 946

Answers (2)

pguardiario

Reputation: 54984

You could do:

$('*').get().map(el => el.name)
// [ 'html', 'head', 'title', 'body', 'br', 'h1', 'p' ]

Note that closing tags aren't discrete nodes; each one is part of the same element node as its opening tag, which is why they don't appear in the result.
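If you do need the closing tags in the array, one option is to walk the tree yourself and emit a "/name" entry after each element's children. Below is a minimal sketch, not a definitive implementation. `collectTags`, `VOID_ELEMENTS`, and `ELEMENT_TYPES` are hypothetical names of my own, and the hand-built `tree` is only a stand-in shaped like the nodes Cheerio produces (`type`, `name`, `children`) so the snippet runs without the library. With Cheerio loaded, passing `$.root()[0]` instead of `tree` should work the same way, though I haven't verified that against every Cheerio version.

```javascript
// HTML void elements have no closing tag (e.g. <br>), so no "/name" entry is emitted for them.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr',
]);

// Node types used for element nodes in Cheerio's DOM (assumption: domhandler-style nodes).
const ELEMENT_TYPES = new Set(['tag', 'script', 'style']);

// Hypothetical helper: depth-first walk that records the tag name on the way in
// and "/name" on the way out, skipping text nodes and void elements' closers.
function collectTags(node, out = []) {
  const isElement = ELEMENT_TYPES.has(node.type);
  if (isElement) out.push(node.name);
  for (const child of node.children || []) collectTags(child, out);
  if (isElement && !VOID_ELEMENTS.has(node.name)) out.push('/' + node.name);
  return out;
}

// Hand-built stand-in for the parsed HTML from the question.
const text = (data) => ({ type: 'text', data });
const el = (name, children = []) => ({ type: 'tag', name, children });

const tree = {
  type: 'root',
  children: [
    el('html', [
      el('head', [el('title', [text('This is text within the title tag')])]),
      el('body', [
        text('This is text in the body tag'),
        el('br'),
        el('h1', [text('This is text in the h1 tag')]),
        el('p', [text('This is text in the p tag')]),
        text('There is more text in the body after the p tag'),
      ]),
    ]),
  ],
};

const tags = collectTags(tree);
console.log(tags);
```

The closing entries here are derived from the tree structure, not read from the source markup, so an element whose closing tag was omitted in the original HTML still gets a "/name" entry.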

Upvotes: 2

Lansana Camara

Reputation: 9873

I don't think you need an external library for this. In a browser, you can walk the DOM yourself using a small function.

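const list = [];

// Visit a node, then recurse into each of its children in document order.
function walkTheDOM(node, iteratee) {
    iteratee(node);
    node = node.firstChild;

    while (node) {
        walkTheDOM(node, iteratee);
        node = node.nextSibling;
    }
}

// Start at the <html> element; every node is collected, including text nodes.
walkTheDOM(document.getElementsByTagName('html')[0], function (node) {
    list.push(node);
});

console.log(list);
// [html, head, text, meta, ...]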

Here is a Fiddle.

Upvotes: 0
