Reputation: 15729
I think this may be just basic syntax. I'm coming from Java and very new to JavaScript. For example, when I see a $ in all the examples, my mind goes blank.
Code for parsing the HTTP request (which contains a bunch of dog shows) looks like (using the request library):
function parseRequest1(error, response, body) {
// TODO should check for error...
var Cheerio = require('cheerio');
parser = Cheerio.load(body);
var table2 = parser('.qs_table[bgcolor="#71828A"]');
var showList = [];
// skip over a bunch of crap to find the table. Each row with this BG color represents a dog show
var trows = parser('tr[bgcolor="#FFFFFF"]', table2);
trows.each(function(i, tablerow) {
var show = parseShow(tablerow);
if (show) // returns a null if something went wrong
showList.push(show);
});
// then do something with showList...
}
which is called by
Request.get(URL, parseRequest1);
So far, so good. Where I'm stuck is in how to write the parseShow function. I'd like to go something like
function parseShow(tableRow) {
var tds = parser('td', tableRow);
//and then go through the tds scraping info...
}
but I get an error:
TypeError: Object #<Object> has no method 'find'
at new module.exports (C:\Users\Morgan\WebstormProjects\agility\node_modules\cheerio\lib\cheerio.js:76:18)
at exports.load.initialize (C:\Users\Morgan\WebstormProjects\agility\node_modules\cheerio\lib\static.js:19:12)
at parseShow (C:\Users\Morgan\WebstormProjects\agility\routes\akc.js:20:15)
Looking at the stack trace, it looks like Cheerio is creating a new one. How am I supposed to pass the Cheerio parser down to the second function? Right now parser is a global var in the file.
I've tried a bunch of random things like these but they don't work either:
var tds = tableRow('td');
var tds = Cheerio('td', tableRow);
What I'm forced to do instead is a bunch of disgusting, fragile code accessing tableRow.children[1], tableRow.children[3], etc. (the HTML has \r\n line breaks all over creation, so many of the children are whitespace)
Upvotes: 1
Views: 2939
Reputation: 2866
I know what you mean about the $(..). The $ is just a function name. I think it was chosen because it's short and catches the eye.
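To make the "just a function name" point concrete, here is a minimal sketch in plain JavaScript with no Cheerio involved. The function body and the names lookup and viaDollar are made up for illustration; the only point is that $ is a legal identifier, so $ and parser can be two names for the same function.

```javascript
// `$` is a legal identifier in JavaScript, so it can name an ordinary function.
// This stand-in body is hypothetical; it only echoes its argument.
function $(selector) {
  return 'looked up: ' + selector;
}

// Any other name can point at the same function.
var lookup = $;

var viaDollar = $('td');
var viaLookup = lookup('td');

console.log(viaDollar === viaLookup);
```

This is why code that calls parser(..) and code that calls $(..) can behave identically: the name is whatever variable the loaded Cheerio function was assigned to.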
With Cheerio, and more generally jQuery, $ is called with CSS selectors:
var table2 = $('.qs_table[bgcolor="#71828A"]');
The advantage of this is that table2 is now a selector object and will have a .find() method which can be called.
In jQuery (I'm not so sure about Cheerio), the selector object is also a collection, so multiple elements can be matched (or none).
The object model in JavaScript is a lot more dynamic than Java's, which can lead to much shorter, if more confusing, code.
The code to parse table rows:
$('tr[bgcolor="#FFFFFF"]').each(function(i, tablerow) {
    // tablerow is a raw DOM node here, so wrap it with $(..) before calling .text()
    var show = $(tablerow).text();
    if (show) // skip rows that contain no text
        showList.push(show);
});
In your code above parser(..) is used rather than $(..). However, once the object has been loaded with the body, you can just keep using it:
parser('tr[bgcolor="#FFFFFF"]').each(function(i, tablerow) {
or, to find only the rows of the table you want:
parser('.qs_table[bgcolor="#71828A"] tr[bgcolor="#FFFFFF"]').each(function(i, tablerow) {
The selector is CSS, so this will find all tr[bgcolor="#FFFFFF"] elements which are descendants of the .qs_table[bgcolor="#71828A"] element.
Upvotes: 2