Tarabass

Reputation: 3152

Cheerio attributeStartsWith selector

I'm using cheerio to scrape a website. I want to select all elements whose id starts with a certain value. But when I use the attributeStartsWith selector the way I would in jQuery, I get a "malformed attribute selector" syntax error.

In jQuery you can do this to select all div elements whose id starts with 'question-summary-':

$('div[id^="question-summary-"')

My node code looks like this

const cheerio = require('cheerio')
const $ = cheerio.load('https://stackoverflow.com/')

console.log('text', $('div[id^="question-summary-"').text())

How can I accomplish this in cheerio? Is there another way to do this?

Upvotes: 0

Views: 2466

Answers (4)

Carl Verret

Reputation: 616

I was curious about your problem, so I put this simple code together. It had no problem parsing the front page of Stack Overflow:

const cheerio = require('cheerio')
const request = require('request')

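// Note: this try/catch only covers the synchronous call to request();
// request failures arrive via the callback's `error` argument.
try {
  request('https://stackoverflow.com/', function (error, response, html) {
    if (!error && response.statusCode == 200) {
      var $ = cheerio.load(html);

      $('[id|=question-summary]').each(function (i, element) {
        // `element` is a raw DOM node, so wrap it with $() to call .text()
        console.log($(element).text());
      });
    }
  });
}
catch (e) {
  console.log(e);
}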

Upvotes: 0

Tarabass

Reputation: 3152

I now see that I had a typo, and strangely jQuery fully accepts it. I fixed the typo and now it works. Cheerio was right, and jQuery should be less forgiving.

Old selector

$('div[id^="question-summary-"')

New selector

$('div[id^="question-summary-"]')

Notice the closing bracket at the end.

Strangely, the first selector is fully accepted by jQuery. To test the old selector, go to stackoverflow.com, press F12, and paste it into the console. You will see that both selectors work.
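Since the whole problem came down to one missing bracket, here is a minimal sketch of a check you could run on a selector string before handing it to cheerio. Note that hasBalancedBrackets is a hypothetical helper of my own, not part of cheerio or jQuery, and it assumes the selector contains no backslash escapes.

```javascript
// Hypothetical helper (not part of cheerio or jQuery): reports whether every
// "[" in a selector has a matching "]", ignoring brackets inside quotes.
// Assumption: the selector contains no backslash-escaped characters.
function hasBalancedBrackets(selector) {
  let depth = 0;
  let quote = null; // the quote character we are currently inside, if any

  for (const ch of selector) {
    if (quote) {
      if (ch === quote) quote = null; // closing quote
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
    } else if (ch === '[') {
      depth += 1;
    } else if (ch === ']') {
      depth -= 1;
      if (depth < 0) return false; // closing bracket without an opener
    }
  }

  // Balanced only if every bracket and every quote was closed.
  return depth === 0 && quote === null;
}

console.log(hasBalancedBrackets('div[id^="question-summary-"'));  // false
console.log(hasBalancedBrackets('div[id^="question-summary-"]')); // true
```

This only catches unbalanced brackets and quotes. It is not a full selector validator, so cheerio may still reject a string that passes it.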

Upvotes: 0

b26

Reputation: 198

You have a syntax bug :)

Change

console.log('text', $("div[id^='question-summary-'").text())

to

console.log('text', $("div[id^='question-summary-']").text())

Full Code

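const cheerio = require('cheerio')
// Note: cheerio.load() parses an HTML string; it does not download the URL.
const $ = cheerio.load('https://stackoverflow.com/')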

console.log('text', $("div[id^='question-summary-']").text());

Cheers

Upvotes: 5

Carl Verret

Reputation: 616

Have you tried using

$("[id|='question-summary']")

instead?

The |= operator matches an attribute whose value is exactly the given string, or starts with that string immediately followed by a hyphen.
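To make the difference between ^= and |= concrete, here is a rough sketch that mimics the two matching rules in plain JavaScript. The function names and the sample ids are made up for illustration. This is not how cheerio implements them internally, just the rules as the CSS selectors spec describes them.

```javascript
// Mimics [attr^="prefix"]: the attribute value starts with the prefix.
// Per the CSS spec, an empty prefix matches nothing.
function startsWithMatch(value, prefix) {
  return prefix !== '' && value.startsWith(prefix);
}

// Mimics [attr|="name"]: the value is exactly `name`,
// or starts with `name` immediately followed by a hyphen.
function dashMatch(value, name) {
  return value === name || value.startsWith(name + '-');
}

// Made-up sample ids for illustration.
const ids = ['question-summary-12345', 'question-summary', 'question-summaryfoo'];

for (const id of ids) {
  console.log(
    id,
    '^= "question-summary-":', startsWithMatch(id, 'question-summary-'),
    '|= "question-summary":', dashMatch(id, 'question-summary')
  );
}
```

The two only disagree on the bare id 'question-summary', which |= matches and ^="question-summary-" does not. For ids like 'question-summary-12345' either selector should select the same elements.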

Upvotes: 1
