Reputation: 1235
I am learning to use request and cheerio to parse a simple HTML file. However, the page contains many script tags, and the actual data resides inside them. For example:
<script> var data = {"name":"John","age":33} </script>
So naturally the interesting part is the "data" variable. Is there a more natural way than using a regex to get that data?
Upvotes: 1
Views: 1574
Reputation: 71
With newer versions of jsdom (v16.4.0 here, on Node.js 12.6.0), jsdom.jsdom no longer exists. Use new JSDOM instead, like below:
const jsdom = require("jsdom");
const { JSDOM } = jsdom;
const dom = new JSDOM(`<script> var foo = "bar" </script>`, { runScripts: "dangerously" });
console.log(dom.window.foo); // output is: bar
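The example above runs every script in the page inside a full DOM. As a narrower sketch, assuming the script body has already been extracted as a string (for instance with cheerio's `$('script').html()`), the text alone can be evaluated in a separate context with Node's built-in `vm` module. The helper name `readScriptVar` and the one-second timeout are illustrative assumptions, not part of jsdom or cheerio.

```javascript
const vm = require('vm');

// Hypothetical helper: evaluate an inline script's text in a fresh context
// and return one of the global variables it declares.
function readScriptVar(scriptText, varName) {
  const sandbox = {}; // becomes the global object for the evaluated script
  vm.runInNewContext(scriptText, sandbox, { timeout: 1000 });

  const value = sandbox[varName];
  if (value === undefined) {
    return undefined; // the script did not define that variable
  }
  // Round-trip through JSON so the result is a plain object in this context.
  return JSON.parse(JSON.stringify(value));
}

const scriptText = 'var data = {"name":"John","age":33}';
const data = readScriptVar(scriptText, 'data');
console.log(data.name, data.age); // prints: John 33
```

Note that `vm` is not a security boundary, so this is only suitable for pages you trust, much like `runScripts: "dangerously"`.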
Upvotes: 2
Reputation: 34313
I don't believe cheerio supports executing inline scripts. However, you can use jsdom for your use case:
var jsdom = require('jsdom')
var html = '<script>var data = {"name":"John","age":33} </script>'
jsdom.defaultDocumentFeatures = {
  FetchExternalResources: ['script'],
  ProcessExternalResources: ['script'],
  MutationEvents: '2.0',
  QuerySelector: false
}
var document = jsdom.jsdom(html)
var window = document.createWindow()
console.dir(window.data)
Upvotes: 0