Jose Bernhardt

Reputation: 887

Returning HTML body from Nightmare.js

I'm currently doing some scraping with cheerio and Nightmare. I'm using both rather than cheerio alone because I have to interact with the site to reach the part I want to scrape, and I've found Nightmare very good for scripting that.

Right now I use Nightmare to navigate to the point where the information I need is displayed. Then, in evaluate(), I'm trying to return the current HTML so I can pass it to cheerio for scraping. The problem is that I don't know how to retrieve the HTML from the document object. Is there a property on document that returns the full body?

Here is what I'm trying to do:

var Nightmare = require('nightmare');
var nightmare = Nightmare({ show: true });
var express = require('express');
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var app = express();

var urlWeb = "url";
var selectCity = "#ddl_city";

nightmare
    .goto(urlWeb)
    .wait(selectCity)
    .select('#ddl_city', '19')
    .wait(6000)
    .select('#ddl_theater', '12')
    .wait(1000)
    .click('#btn_enter')
    .wait('#aspnetForm')
    .evaluate(function () {
        // here is where I want to return the html body
        return document.html;
    })
    .then(function (body) {
        // loading html body to cheerio
        var $ = cheerio.load(body);
        console.log(body);
    });

Upvotes: 7

Views: 6414

Answers (1)

Jose Bernhardt

Reputation: 887

Returning this from evaluate() worked:

document.body.innerHTML
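
For anyone landing here, a minimal sketch of how that could be wired into the chain from the question. The helper names getBodyHtml and getFullHtml are made up for illustration; the second one uses document.documentElement.outerHTML, which is a standard DOM property that I'm assuming is what you want if you also need the head. Note that neither includes the doctype.

```javascript
// Both helpers are meant to be passed to nightmare.evaluate(), so they run
// inside the page, where `document` is the page's DOM.
// The names are hypothetical, not part of Nightmare or cheerio.

// Markup inside <body> only (what the answer above uses).
function getBodyHtml() {
    return document.body.innerHTML;
}

// Whole <html> element, including <head>; the doctype is not included.
function getFullHtml() {
    return document.documentElement.outerHTML;
}

// Sketch of the usage, with the same chain as in the question:
//
//   nightmare
//       .goto(urlWeb)
//       .wait('#aspnetForm')
//       .evaluate(getBodyHtml)
//       .then(function (body) {
//           var $ = cheerio.load(body);
//       });
```

Either string can be handed straight to cheerio.load(); the body-only version is usually enough when you just want to run selectors over the rendered content.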

Upvotes: 10
