Nguyen Hoang

Reputation: 548

Scrape paginated pages using Node.js and cheerio

How can I scrape data from paginated pages?

My code works well for one page, but I need to scrape the data from page 2, page 3, and so on, and push everything into a single ebooks array.

Here is my code:

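// require() works with node-fetch v2; v3 is ESM-only and needs import instead.
const fetch = require('node-fetch');
const cheerio = require('cheerio');

// getUrl(page, query) is defined elsewhere and builds the search URL.
function searchEbooks(query) {
    return fetch(getUrl(1, query))
        .then(res => res.text())
        .then(body => {
            // Load the HTML into cheerio so that $ is defined.
            const $ = cheerio.load(body);
            const ebooks = [];
            $('article').each(function(i, element) {
                const $element = $(element);
                const $title = $element.find('.entry-title a');
                const $image = $element.find('.attachment-post-thumbnail');
                const $description = $element.find('.entry-summary');
                const authors = [];
                // Collect the text of every author link in this article.
                $element.find('.entry-author a').each(function(j, authorElement) {
                    const author = $(authorElement).text();
                    authors.push(author);
                });
                const ebook = {
                    image: $image.attr('src'),
                    title: $title.text(),
                    description: $description.text(),
                    authors: authors,
                };
                ebooks.push(ebook);
            });
            return ebooks;
        });
}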

I have no idea how to do this. Please give me a hint or an example.

I use the cheerio and node-fetch packages.

Thank you.

Upvotes: 2

Views: 1342

Answers (1)

hong4rc

Reputation: 4103

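Try this to get the URL of the next page. It assumes the link placed right after the element with class "current" in the pagination points to the next page:

var href = $('.current+a').attr('href');

if (href) {
    // There is a next page: fetch this URL and scrape it the same way.
} else {
    console.log('All pages have been scraped');
}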
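
To collect every page into one array, the check above can be repeated in a loop. The following is only a minimal sketch under some assumptions: collectAllPages, loadPage and maxPages are hypothetical names that do not appear in the question, and loadPage(url) is assumed to resolve to an object with the items scraped from that page and the href of the next link (undefined on the last page).

```javascript
// Hypothetical helper: follows "next" links until none is left.
// loadPage(url) must resolve to { items: [...], next: string | undefined }.
async function collectAllPages(firstUrl, loadPage, maxPages = 100) {
    const all = [];
    const seen = new Set(); // URLs already visited, to avoid looping forever
    let url = firstUrl;
    while (url && !seen.has(url) && seen.size < maxPages) {
        seen.add(url);
        const { items, next } = await loadPage(url);
        all.push(...items);
        // Resolve a relative href against the current page URL;
        // stop when the page has no next link.
        url = next ? new URL(next, url).href : null;
    }
    return all;
}
```

In the question's setup, loadPage would probably fetch the URL, call cheerio.load(body), build the ebook objects exactly as searchEbooks does, and return them as items together with next set to $('.current+a').attr('href'). The first URL must be absolute so that relative links can be resolved against it.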

Upvotes: 2
