Reputation: 1233
Python has a library called Beautiful Soup that can parse an HTML tree without sending GET requests to external web pages. I'm looking for the same in JavaScript, but I've only found jsdom and JSSoup (which seems little used), and if I'm correct, they only let you make requests.
I want a JavaScript library that lets me parse an entire HTML tree without running into CORS policy errors, that is, without making a request, just parsing it.
How can I do this?
Upvotes: 11
Views: 19908
Reputation: 396
In a browser context, you can use DOMParser:
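// Parse a string into a detached Document; no network request is made.
const html = "<h1>title</h1>";
const parser = new DOMParser();
const parsed = parser.parseFromString(html, "text/html");
// firstChild is the wrapping <html> element, so select the <h1> directly.
console.log(parsed.querySelector("h1").textContent); // "title"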
In Node.js, you can use node-html-parser:
import { parse } from 'node-html-parser';
const html = "<h1>title</h1>";
// parse() works on the string itself; here the root's first child is the <h1>.
const parsed = parse(html);
console.log(parsed.firstChild.innerText); // "title"
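Once the string is parsed, the tree can be queried with selectors much as in Beautiful Soup. Here is a minimal sketch, assuming the parsed root exposes `querySelectorAll` and its elements expose `getAttribute` (true for a `Document` from DOMParser and, as far as I know, for the root returned by node-html-parser). `extractLinks` and `parseInBrowser` are hypothetical names made up for this example, not part of either library.

```javascript
// Hypothetical helper: collects the href of every <a> in an HTML string.
// `parseHtml` is any function that turns a string into an object exposing
// querySelectorAll, so the same code can serve both environments.
function extractLinks(html, parseHtml) {
  const root = parseHtml(html);
  return Array.from(root.querySelectorAll("a"))
    .map((a) => a.getAttribute("href"))
    // A missing attribute may come back as null or undefined depending on the parser.
    .filter((href) => href !== null && href !== undefined);
}

// Browser: wrap DOMParser (only evaluated when called, so defining it is harmless elsewhere).
const parseInBrowser = (html) =>
  new DOMParser().parseFromString(html, "text/html");

// Node.js (assumption: node-html-parser is installed): pass its `parse` function instead,
// e.g. extractLinks(html, parse).
```

In a browser you would call `extractLinks(html, parseInBrowser)`. Nothing here touches the network, so CORS never comes into play; the only input is the string you already have.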
Upvotes: 8