alexpfx

Reputation: 6700

Cheerio - Get correct text when selector returns more than one result

https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/1?ref=botao


I would like to get the marked text from the page above ("Sábado, 14 de Abril de 2018" and "16:00").

I did this with Kotlin and the jsoup library:

val date = select("div.col-sm-8 > span.text-2")[1] //Sábado, 14 de Abril de 2018
val time = select("div.col-sm-8 > span.text-2")[2] //16:00

The query div.col-sm-8 > span.text-2 returns a list of elements, and I simply pick the right one by index.

But due to other issues, I have to use JavaScript.

I tried to do the same thing using JavaScript and the Cheerio library, but it doesn't seem to work the same way, even though both selector syntaxes are based on jQuery:

const scherio = require('cheerio');
const rp = require('request-promise');

/**
 * @type {string}
 */
const baseurl = "https://www.cbf.com.br/futebol-brasileiro/competicoes/campeonato-brasileiro-serie-a/2018/";

const turn = 190;

let totalGames = 1;
const gamesPerRound = 10;

module.exports =

    class FetchRoundsFromCbf {

        fetchRounds() {

            for (let i = 1; i <= totalGames; i++) {
                let url = baseurl.concat(i.toString());
                rp(url).then(function (html) {
                    const $ = scherio.load(html);


                    let date = $("div.col-sm-8 > span.text-2")[1];
                    let time = $("div.col-sm-8 > span.text-2")[2];


                     console.log(date.text());
                     console.log(time.text());
                });

            }
        }

    }

This gives me:

Unhandled rejection TypeError: date.text is not a function
    at /home/alexandre/dev/flutter/brasileiro-parser-js/network/fetchdata/FetchRoundsFromCbf.js:32:39
    at tryCatcher (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/util.js:16:23)
    at Promise._settlePromiseFromHandler (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:512:31)
    at Promise._settlePromise (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:569:18)
    at Promise._settlePromise0 (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:614:10)
    at Promise._settlePromises (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/promise.js:694:18)
    at _drainQueueStep (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:138:12)
    at _drainQueue (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:131:9)
    at Async._drainQueues (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:147:5)
    at Immediate.Async.drainQueues [as _onImmediate] (/home/alexandre/dev/flutter/brasileiro-parser-js/node_modules/bluebird/js/release/async.js:17:14)
    at processImmediate (timers.js:637:19)

Then I print only the query result:

 console.log(date);
 console.log(time);

I receive:

{ type: 'tag',
  name: 'span',
  namespace: 'http://www.w3.org/1999/xhtml',
  attribs: [Object: null prototype] { class: 'text-2 p-r-20' },
  'x-attribsNamespace': [Object: null prototype] { class: undefined },
  'x-attribsPrefix': [Object: null prototype] { class: undefined },
  children:
   [ { type: 'tag',
       name: 'i',
       namespace: 'http://www.w3.org/1999/xhtml',
       attribs: [Object],
       'x-attribsNamespace': [Object],
       'x-attribsPrefix': [Object],
       children: [],
       parent: [Circular],
       prev: null,
       next: [Object] },
     { type: 'text',
       data: ' Sábado, 14 de Abril de 2018',
       parent: [Circular],
       prev: [Object],
       next: null } ],
  parent:
   { type: 'tag',
     name: 'div',
     namespace: 'http://www.w3.org/1999/xhtml',
     attribs: [Object: null prototype] { class: 'col-sm-8' },
     'x-attribsNamespace': [Object: null prototype] { class: undefined },
     'x-attribsPrefix': [Object: null prototype] { class: undefined },
     children:
      [ [Object],
        [Object],
        [Object],
        [Circular],
        [Object],
        [Object],
        [Object] ],
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'text',
        data: '\n            ',
        parent: [Object],
        prev: null,
        next: [Circular] },
     next:
      { type: 'text',
        data: '\n            ',
        parent: [Object],
        prev: [Circular],
        next: [Object] } },
  prev:
   { type: 'text',
     data: '\n                              ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Circular] },
     next: [Circular] },
  next:
   { type: 'text',
     data: '\n                ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev: [Circular],
     next:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Circular],
        next: [Object] } } }
{ type: 'tag',
  name: 'span',
  namespace: 'http://www.w3.org/1999/xhtml',
  attribs: [Object: null prototype] { class: 'text-2 p-r-20' },
  'x-attribsNamespace': [Object: null prototype] { class: undefined },
  'x-attribsPrefix': [Object: null prototype] { class: undefined },
  children:
   [ { type: 'tag',
       name: 'i',
       namespace: 'http://www.w3.org/1999/xhtml',
       attribs: [Object],
       'x-attribsNamespace': [Object],
       'x-attribsPrefix': [Object],
       children: [],
       parent: [Circular],
       prev: null,
       next: [Object] },
     { type: 'text',
       data: ' 16:00',
       parent: [Circular],
       prev: [Object],
       next: null } ],
  parent:
   { type: 'tag',
     name: 'div',
     namespace: 'http://www.w3.org/1999/xhtml',
     attribs: [Object: null prototype] { class: 'col-sm-8' },
     'x-attribsNamespace': [Object: null prototype] { class: undefined },
     'x-attribsPrefix': [Object: null prototype] { class: undefined },
     children:
      [ [Object],
        [Object],
        [Object],
        [Object],
        [Object],
        [Circular],
        [Object] ],
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'text',
        data: '\n            ',
        parent: [Object],
        prev: null,
        next: [Circular] },
     next:
      { type: 'text',
        data: '\n            ',
        parent: [Object],
        prev: [Circular],
        next: [Object] } },
  prev:
   { type: 'text',
     data: '\n                ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev:
      { type: 'tag',
        name: 'span',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Circular] },
     next: [Circular] },
  next:
   { type: 'text',
     data: '\n                          ',
     parent:
      { type: 'tag',
        name: 'div',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: [Object],
        'x-attribsNamespace': [Object],
        'x-attribsPrefix': [Object],
        children: [Array],
        parent: [Object],
        prev: [Object],
        next: [Object] },
     prev: [Circular],
     next: null } }

I'm not very good at JavaScript. How can I retrieve the information I need?

Upvotes: 1

Views: 649

Answers (2)

Guillermo Gutiérrez

Reputation: 17829

You can use eq() to get a Cheerio element by index, the same way as in jQuery.

let date = $("div.col-sm-8 > span.text-2").eq(1);
let time = $("div.col-sm-8 > span.text-2").eq(2);

eq() reduces the set of matched elements to the one at the specified index.
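
To see why the indexed version in the question fails: [1] hands back a plain node object like the one printed there, and that object has no text() method. Here is a minimal sketch using a hand-built object shaped like that dump (only the fields needed; the plainText helper is hypothetical and not part of Cheerio):

```javascript
// Hand-built stand-in for the node printed in the question (trimmed to the
// fields used here); real nodes come from Cheerio's parser.
const dateNode = {
  type: 'tag',
  name: 'span',
  children: [
    { type: 'tag', name: 'i', children: [] },
    { type: 'text', data: ' Sábado, 14 de Abril de 2018' },
  ],
};

// Hypothetical helper: joins the data of every text node under a node.
function plainText(node) {
  if (node.type === 'text') return node.data;
  return (node.children || []).map(plainText).join('');
}

console.log(typeof dateNode.text);       // "undefined": a raw node has no text()
console.log(plainText(dateNode).trim()); // "Sábado, 14 de Abril de 2018"
```

In real code there is no need to walk the node by hand: eq() keeps the result wrapped as a Cheerio object, so text() is available on it.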

Upvotes: 1

alexpfx

Reputation: 6700

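I got what I wanted by using slice with an explicit end index, so each selection holds a single span:

let date = $("div.col-sm-8").find("span").slice(1, 2); // only the span at index 1
let time = $("div.col-sm-8").find("span").slice(2, 3); // only the span at index 2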

console.log(date.text());
console.log(time.text());
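
One thing to watch with slice: it follows the same start/end semantics as Array.prototype.slice, so leaving out the end index keeps every element from the start index onward, and text() on a selection with several elements returns their combined text. A rough sketch with a plain array standing in for the matched spans (the first entry is a placeholder, since the question does not show what the span at index 0 contains):

```javascript
// Stand-in for the text of each matched span; '(span 0)' is a placeholder.
const spanTexts = ['(span 0)', ' Sábado, 14 de Abril de 2018', ' 16:00'];

// Mimics text() on a selection: the texts of all elements are joined.
const combinedText = (items) => items.join('');

const fromIndexOne = combinedText(spanTexts.slice(1));    // spans 1 and 2 together
const onlyIndexOne = combinedText(spanTexts.slice(1, 2)); // span 1 alone

console.log(fromIndexOne.trim()); // date and time run together
console.log(onlyIndexOne.trim()); // just the date
```

If only one element is needed, eq(1) is equivalent to slice(1, 2) and reads more directly.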

Upvotes: 0
