Reputation: 623
So I have this Python code that I'm trying to convert to Node.js, but I am not sure how.
import urllib.request, re

def getDef(word):
    link = "http://www.merriam-webster.com/dictionary/%s" % word
    data = urllib.request.urlopen(link).read().decode()
    try:
        return re.search("<p>: (.*?)</p><p>", data).group(1)
    except:
        return "No match"

class newDefinition:
    def __init__(self, word):
        self.definition = getDef(word)
>>> definition = newDefinition("color")
>>> print(definition.definition)
a quality such as red, blue, green, yellow, etc., that you see when you look at something
In Node.js, however, I can't seem to return the value the way I do in Python, because of its callback-based way of doing things. How would I write the Node.js equivalent, or is there no equivalent? Here is what I have so far; maybe you can spot what I'm doing wrong and how to fix it.
var urllib = require("urllib"); // installed with npm

var getDef = function(word){
    var link = "http://www.merriam-webster.com/dictionary/" + word;
    var urlData = urllib.request(link, {}, function(err, data, res){
        var re = new RegExp("<p>: (.*?)</p><p>");
        var results = data.toString();
        var match = re.exec(results)[1];
        return match; // Expected it to give urlData the definition
    });
    return urlData;
}

var Definition = function(word){
    this.definition = getDef(word);
}

definition = new Definition("color");
console.log(definition.definition); // this won't give the definition but the information of the urllib itself rather.
In general, I'm trying to figure out how to use asynchronous code so that I can return the values I need. I'm not used to this concept, so is there an equivalent to it in Python? Also, if you can point me to some good documentation on asynchronous code, that would be great.
Upvotes: 0
Views: 670
Reputation: 2443
Here are my two cents.
Never use regular expressions to parse HTML (refer here for more details); instead, use an XPath-like library to parse the document. You can use libraries like cheerio or phantomjs.
Here is a clean solution.
var request = require('request'),
    when = require('when'),
    cheerio = require('cheerio');

var URL = 'http://www.merriam-webster.com/dictionary/';

/**
 * @param word: Word to search the dictionary
 * @returns
 *   Promise object which resolves to array of
 *   definitions of the word
 */
var getDef = function(word){
    var defer = when.defer();
    request(URL + word, function(err, res, body){
        if (err || res.statusCode !== 200){
            // Return here so a failed request does not fall through to parsing
            return defer.reject(err || new Error('Unexpected status code: ' + res.statusCode));
        }
        var defs = [];
        var $ = cheerio.load(body);
        $('.wordclick .headword:first-child p').each(function(i, ele){
            var definition = $(ele).text();
            defs.push(definition);
        });
        defer.resolve(defs);
    });
    return defer.promise;
}

getDef('happy').then(function(words){
    console.log(words);
});
Note: Here I am using when (a Promises/A+ library) instead of Node's standard continuation-passing (callback) style.
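For comparison, the same promise-returning shape can be sketched with the built-in Promise and async/await available in modern Node.js, without the when dependency. This is a sketch under stated assumptions, not the method above: `fetchDefs` and `CANNED` are hypothetical stand-ins that replace the real HTTP request and HTML parsing with canned data so the example runs offline.

```javascript
// Hypothetical stub standing in for request + cheerio:
// returns canned definitions asynchronously instead of hitting the network.
const CANNED = {
    happy: ['canned definition 1', 'canned definition 2']
};

function fetchDefs(word, callback) {
    setImmediate(() => {
        if (!Object.prototype.hasOwnProperty.call(CANNED, word)) {
            return callback(new Error('No entry for "' + word + '"'));
        }
        callback(null, CANNED[word]);
    });
}

// Wrap the callback-style API in a native Promise instead of when.defer().
function getDef(word) {
    return new Promise((resolve, reject) => {
        fetchDefs(word, (err, defs) => {
            if (err) {
                reject(err);
            } else {
                resolve(defs);
            }
        });
    });
}

// async/await lets the consuming code read almost like the synchronous Python version.
async function main() {
    try {
        const defs = await getDef('happy');
        console.log(defs);
    } catch (err) {
        console.error(err.message);
    }
}

main();
```

The value is still not "returned" synchronously; `await` only pauses `main` until the promise settles, so any code that needs the definition has to live inside an async function or a `then` handler.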
Upvotes: 1
Reputation: 91609
Since return inside the request callback only returns from that callback, not from getDef, the value never reaches your caller, so you need to use a callback. It would look like this:
var urllib = require("urllib");

var getDef = function(word, callback){
    var link = 'http://www.merriam-webster.com/dictionary/' + word;
    urllib.request(link, {}, function(err, data, res) {
        var re = new RegExp('<p>: (.*?)</p><p>');
        var results = data.toString();
        var match = re.exec(results)[1];
        callback(match);
    });
};
Then you would pass a callback while calling the function:
getDef('color', function(definition) {
    console.log(definition);
});
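The callback version above does not handle a failed request or a page without a match. A minimal sketch of Node's error-first callback convention follows; `fetchPage` and `PAGES` are hypothetical stand-ins for the HTTP request (they serve canned HTML through `setImmediate` so the example runs offline) and are not part of urllib.

```javascript
// Hypothetical stand-in for urllib.request: delivers canned HTML asynchronously.
var PAGES = {
    color: '<p>: a quality such as red, blue, green</p><p>more</p>'
};

function fetchPage(word, callback) {
    setImmediate(function () {
        if (!Object.prototype.hasOwnProperty.call(PAGES, word)) {
            return callback(new Error('Request failed for "' + word + '"'));
        }
        callback(null, PAGES[word]);
    });
}

// Error-first convention: callback(err) on failure, callback(null, value) on success.
function getDef(word, callback) {
    fetchPage(word, function (err, html) {
        if (err) {
            return callback(err);
        }
        var match = /<p>: (.*?)<\/p><p>/.exec(html);
        // Mirror the Python fallback when the pattern is not found.
        callback(null, match ? match[1] : 'No match');
    });
}

getDef('color', function (err, definition) {
    if (err) {
        return console.error(err.message);
    }
    console.log(definition);
});
```

Putting the error in the first argument matches what most Node.js core APIs do, so callers can check `err` before touching the result.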
Edit: Setting an object's property has the same idea. It might look like this instead:
var Definition = function(word, callback) {
    var self = this;
    getDef(word, function(definition) {
        self.definition = definition;
        // Invoke the caller's callback once the property has been set
        callback.call(self);
    });
};
And would be called like so:
var definition = new Definition('color', function() {
    console.log(definition.definition);
});
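Because a constructor cannot wait for an asynchronous result, another option is a factory function that builds the object only after the definition has arrived. The sketch below is one possible arrangement, not the approach above: `createDefinition` is a hypothetical helper, and `getDef` here is a stub that returns a canned string instead of making the HTTP request.

```javascript
// Hypothetical stub for the asynchronous lookup; a real getDef would issue the HTTP request.
function getDef(word, callback) {
    setImmediate(function () {
        callback('canned definition of ' + word);
    });
}

// Plain synchronous constructor: it only stores values that already exist.
function Definition(word, definition) {
    this.word = word;
    this.definition = definition;
}

// Factory: hands a fully populated object to the callback.
function createDefinition(word, callback) {
    getDef(word, function (definition) {
        callback(new Definition(word, definition));
    });
}

createDefinition('color', function (def) {
    console.log(def.definition);
});
```

With this arrangement no code can observe a half-initialized object whose `definition` property is still undefined.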
Upvotes: 2