Reputation: 3152
I'm using cheerio to scrape a website. I want to select all elements whose id starts with a certain value. But when I use the attributeStartsWith selector, as in jQuery, I get a "malformed attribute selector" syntax error.
In jQuery you can do this to select all div elements whose id starts with 'question-summary-':
$('div[id^="question-summary-"')
My Node code looks like this:
const cheerio = require('cheerio')
const $ = cheerio.load('https://stackoverflow.com/')
console.log('text', $('div[id^="question-summary-"').text())
How can I accomplish this in cheerio? Is there another way to do this?
Upvotes: 0
Views: 2466
Reputation: 616
I was curious about your problem, so I put this simple code together. It had no problem parsing the front page of Stack Overflow:
const cheerio = require('cheerio')
const request = require('request')
try {
  request('https://stackoverflow.com/', function (error, response, html) {
    if (!error && response.statusCode === 200) {
      var $ = cheerio.load(html);
      $('[id|=question-summary]').each(function (i, element) {
        // element is a raw node, so wrap it with $() before calling .text()
        console.log($(element).text());
      });
    }
  });
}
catch (e) {
  console.log(e);
}
Upvotes: 0
Reputation: 3152
I now see that I had a typo, and strangely it is fully accepted by jQuery. I fixed the typo and now it works. Cheerio was right, and jQuery should be less forgiving.
Old selector
$('div[id^="question-summary-"')
New selector
$('div[id^="question-summary-"]')
Notice the closing bracket at the end.
Strangely, the first selector is fully accepted by jQuery. To test the old selector, go to stackoverflow.com, press F12, and paste it into the console. You will see that both selectors work there.
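Since jQuery in the browser hides this kind of typo, a small check for balanced square brackets can catch it before the selector reaches cheerio. This is only a rough sketch: hasBalancedBrackets is a hypothetical helper of my own, not part of cheerio or jQuery, and it does not handle escaped quotes.

```javascript
// Hypothetical helper, not part of cheerio or jQuery.
// Returns true when every '[' in the selector has a matching ']'.
// Brackets inside quoted attribute values are skipped.
function hasBalancedBrackets(selector) {
  let depth = 0;
  let quote = null; // the quote character we are currently inside, if any
  for (const ch of selector) {
    if (quote) {
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '[') {
      depth += 1;
    } else if (ch === ']') {
      depth -= 1;
      if (depth < 0) return false; // closing bracket with no opener
    }
  }
  return depth === 0 && quote === null;
}

console.log(hasBalancedBrackets('div[id^="question-summary-"'));  // false
console.log(hasBalancedBrackets('div[id^="question-summary-"]')); // true
```

Running the old selector through it returns false, the new one returns true.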
Upvotes: 0
Reputation: 198
You have a syntax bug :)
Change
console.log('text', $("div[id^='question-summary-'").text())
to
console.log('text', $("div[id^='question-summary-']").text())
Full Code
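const cheerio = require('cheerio')
// Note: cheerio.load() parses markup and does not fetch URLs, so pass it the downloaded page HTML rather than the address.
const $ = cheerio.load('https://stackoverflow.com/')
console.log('text', $("div[id^='question-summary-']").text());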
Cheers
Upvotes: 5
Reputation: 616
Have you tried using
$("[id|='question-summary']")
instead?
The |= operator matches attribute values that are either exactly the quoted string or that start with it immediately followed by a hyphen.
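To make the difference between |= and ^= concrete, here is a rough sketch in plain JavaScript that mimics the two matching rules. The function names and the sample ids are made up for illustration; they are not cheerio or jQuery APIs.

```javascript
// Plain-JavaScript stand-ins for the two attribute operators.
// These are illustrative functions, not cheerio or jQuery APIs.

// [attr^="value"]: the attribute value starts with `value`
// (an empty `value` matches nothing).
function startsWithMatch(attr, value) {
  return value !== '' && attr.startsWith(value);
}

// [attr|="value"]: the attribute value is exactly `value`,
// or starts with `value` immediately followed by a hyphen.
function hyphenPrefixMatch(attr, value) {
  return attr === value || attr.startsWith(value + '-');
}

// Made-up sample ids for illustration.
const ids = ['question-summary-123', 'question-summary', 'question-summaries'];
for (const id of ids) {
  console.log(
    id,
    startsWithMatch(id, 'question-summary-'),
    hyphenPrefixMatch(id, 'question-summary')
  );
}
// question-summary-123 true true
// question-summary false true
// question-summaries false false
```

So for ids like question-summary-123 both selectors match; they only differ for an id that is exactly question-summary.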
Upvotes: 1