Reputation: 1543
I am scraping Bing search results using Node and cheerio. I need to grab all the href values from two lists that have different IDs.
$("a", ["#b_content", "#b_context"]).each((index, element) => { const href = $(element).attr("href"); links.push(href); });
Refer to the attached screenshot for the HTML.
Update 2: I wanted to ignore the whole <li class="b_pag"> tag, but the solutions I found here and elsewhere ignored just that tag. Any other <li> tag under it, with any other class or no class, does not get ignored.
I found a way around it: I could grab the <li> tags that have other class names. Check out the html here. I am thinking of using four different selectors for the first four classes, like $(.b_algo) or $(.b_ans). But how can I grab the other two <li> tags that have multiple classes associated with them? I could not get a clear idea from the cheerio docs. Hope I am clear enough for you guys! Something like $(.b_ans b_mop) didn't work. Nor did $("li[class=b_ans b_mop").
Upvotes: 3
Views: 3320
Reputation: 64
Try using the Bing Web Search API instead: https://azure.microsoft.com/en-us/services/cognitive-services/bing-web-search-api/
It is the legal and better way to get Bing search results. You can sign up for the free tier of this API if you do not have a lot of searches to do. You can also use the free Azure credit that you receive when you join Azure.
Upvotes: 1
Reputation: 406
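Try this:
// b_content is an ID in the question, so it is selected with "#" rather than "."
$("#b_content li[class!='b_pag']").find("a").each((index, element) => {
  const href = $(element).attr("href");
  console.log(href);
});
If you want to ignore the class, use the attribute selector with the respective tag, like this: li[class!='b_pag']. Note that this compares the whole class attribute, so it only skips <li> tags whose class is exactly "b_pag".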
Upvotes: 1
Reputation: 356
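Try this:
const array = [];
// Both IDs go in one comma-separated selector string; a second argument would be treated as a search context.
$("#b_content, #b_context").each(function(i, elem) {
  array[i] = {
    // attr() returns the href of the first <a> found in each container
    a: $(this).find("a").attr("href")
  };
});
To select "li" except those with class "b_pag", use li:not(.b_pag)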
Upvotes: 3