Reputation: 2385
I'd like to extract two elements from each li inside this unordered list:
<ul class="cookieAlertList padTop10">
<li>
<img src="images/cookieradar/iconHot.gif" />
<div class="cookieAlertDesc">
<a href="/P.aspx?p=16aa6d76104">
Peanut Butter Chocolate Chunk
</a>
<br />
<small>44 mins ago</small>
</div>
</li>
<li>
<img src="images/cookieradar/iconHot.gif" />
<div class="cookieAlertDesc">
<a href="/P.aspx?p=15936a56102">
Oatmeal Wheatgerm Chocolate Chip
</a>
<br />
<small>48 mins ago</small>
</div>
</li>
</ul>
For each of those list items, I'd like to extract the cookie name (contained in the <a> element) and the time (contained in the <small> element).
I was able to extract the 2 list items using:
var li = $('.cookieAlertList').find('li');
but I'm not sure how to proceed.
Upvotes: 3
Views: 2853
Reputation: 56965
Here's an alternative to the existing answer, using more precise selectors, spread syntax, map and .trim():
const cheerio = require("cheerio"); // ^1.0.0-rc.12
const html = `<HTML as above>`;
const $ = cheerio.load(html);
const result = [...$(".cookieAlertList li")].map(e => ({
name: $(e).find(".cookieAlertDesc a").text().trim(),
time: $(e).find(".cookieAlertDesc small").text().trim(),
}));
console.log(result);
Output:
[
{ name: 'Peanut Butter Chocolate Chunk', time: '44 mins ago' },
{ name: 'Oatmeal Wheatgerm Chocolate Chip', time: '48 mins ago' }
]
Upvotes: 0
Reputation: 1129
Like this:
var cheerio = require('cheerio');
// some HTTP Requests to scrape the page content..
var $ = cheerio.load(html);
var result = [];
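// each() passes (index, element), so the element is the second argument
$('ul.cookieAlertList li').each(function(i, el) {
var $div = $(el).find('div.cookieAlertDesc');
var obj = {
// trim() removes the whitespace surrounding the text in the markup
cookieName: $div.find('a').text().trim(),
time: $div.find('small').text().trim()
};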
result.push(obj);
});
console.log('result', result); // JSON.stringify(result, null, 3);
Upvotes: 3