Reputation: 517
Given the following HTML:
<html>
<head>
<title>This is text within the title tag</title>
</head>
<body>
This is text in the body tag
<br>
<h1>This is text in the h1 tag</h1>
<p>This is text in the p tag</p>
There is more text in the body after the p tag
</body>
</html>
I'm looking to use CheerioJS, an HTML parser, to collect each HTML tag into an array for manipulation purposes.
The desired output would be an array of the following:
[html, head, title, /title, /head, body, br, h1, /h1, p, /p, /body, /html]
I've been looking at Cheerio's DOM object but I'm not sure if it's what I need.
Upvotes: 1
Views: 946
Reputation: 54984
You could do:
$('*').get().map(el => el.name)
// [ 'html', 'head', 'title', 'body', 'br', 'h1', 'p' ]
Note that closing tags aren't discrete nodes; they're part of the element that the opening tag starts.
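If you do want the closing tags in the array, one option is to walk the tree yourself and emit a `/name` entry after each element's children. The sketch below is only an illustration: `tagSequence` and `VOID_TAGS` are hypothetical names, and the tree is hand-built to mimic the node shape Cheerio exposes (`type`, `name`, `children`), so it runs without the library.

```javascript
// Void elements such as <br> never get a closing tag in HTML.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Hypothetical helper: depth-first walk that records each element's name
// on entry and "/name" on exit. Assumes nodes look like Cheerio's
// (elements have type 'tag', 'script' or 'style', plus name and children).
function tagSequence(node, out = []) {
  const isElement =
    node.type === 'tag' || node.type === 'script' || node.type === 'style';
  if (isElement) out.push(node.name);
  for (const child of node.children || []) tagSequence(child, out);
  if (isElement && !VOID_TAGS.has(node.name)) out.push('/' + node.name);
  return out;
}

// Hand-built stand-in for the parsed document from the question.
const el = (name, children = []) => ({ type: 'tag', name, children });
const text = (data) => ({ type: 'text', data });
const root = {
  type: 'root',
  children: [
    el('html', [
      el('head', [el('title', [text('This is text within the title tag')])]),
      el('body', [
        text('This is text in the body tag'),
        el('br'),
        el('h1', [text('This is text in the h1 tag')]),
        el('p', [text('This is text in the p tag')]),
        text('There is more text in the body after the p tag'),
      ]),
    ]),
  ],
};

const tags = tagSequence(root);
console.log(tags);
```

With Cheerio itself, the same function should work on the real tree, assuming you loaded the markup with `const $ = cheerio.load(html)` and pass in `$.root()[0]`.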
Upvotes: 2
Reputation: 9873
I don't think you need an external library for this. In a browser, you can walk the DOM yourself with a small recursive function.
const list = [];
function walkTheDOM(node, iteratee) {
  iteratee(node);
  node = node.firstChild;
  while (node) {
    walkTheDOM(node, iteratee);
    node = node.nextSibling;
  }
}
walkTheDOM(document.getElementsByTagName('html')[0], function (node) {
  list.push(node);
});
console.log(list);
// [html, head, text, meta, ...]
Here is a Fiddle.
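The walker above collects every node, including text nodes. If you only want element names, as in the question, you could filter on `nodeType`. This is a rough sketch: `elementNames` and `fakeNode` are hypothetical helpers, and the stand-in nodes exist only so the snippet runs outside a browser.

```javascript
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in browsers

// Same traversal as above, written with a for loop.
function walkTheDOM(node, iteratee) {
  iteratee(node);
  for (let child = node.firstChild; child; child = child.nextSibling) {
    walkTheDOM(child, iteratee);
  }
}

// Hypothetical helper: keep only element nodes and return lowercase names.
function elementNames(root) {
  const names = [];
  walkTheDOM(root, (node) => {
    if (node.nodeType === ELEMENT_NODE) names.push(node.nodeName.toLowerCase());
  });
  return names;
}

// Stand-in nodes (not real DOM objects) exposing only the properties used.
function fakeNode(nodeType, nodeName, children = []) {
  children.forEach((child, i) => {
    child.nextSibling = children[i + 1] || null;
  });
  return { nodeType, nodeName, firstChild: children[0] || null, nextSibling: null };
}

const sample = fakeNode(1, 'HTML', [
  fakeNode(1, 'HEAD', [fakeNode(1, 'TITLE', [fakeNode(3, '#text')])]),
  fakeNode(1, 'BODY', [
    fakeNode(3, '#text'),
    fakeNode(1, 'BR'),
    fakeNode(1, 'H1', [fakeNode(3, '#text')]),
    fakeNode(1, 'P', [fakeNode(3, '#text')]),
    fakeNode(3, '#text'),
  ]),
]);

const names = elementNames(sample);
console.log(names);
```

In a real page you would call `elementNames(document.documentElement)` instead of passing the stand-in tree.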
Upvotes: 0