Reputation:
For input domains like
web.whatsapp.com
facebook.com
electron.atom.io
I want to find the fully qualified URL that would come up when entering the domain in Chrome's address bar or in a Google search. So the output would be:
https://www.facebook.com
https://web.whatsapp.com/
http://electron.atom.io/
The solution should determine at least the protocol and, as in the Facebook example above, the full host name of the best-matching domain. I tried the Google Custom Search API (not free) and Node's basic http/https modules, which don't accept a bare domain.
Any help is appreciated!
Upvotes: 0
Views: 249
Reputation:
Okay, until something better comes along, I'll work with the free DuckDuckGo API and use the URL from the first search result.
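var request = require("request");
// Queries DuckDuckGo for urlIn and calls callback(urlIn, urlOut),
// where urlOut is the URL of the first result or undefined.
var lookupFullyQualifiedURL = function (urlIn, callback) {
    if (typeof callback !== "function") {
        callback = function () {};
    }
    if (urlIn === undefined) {
        return callback(urlIn, undefined);
    }
    // Encode the query so special characters cannot break the request URL.
    const srcUrl = "https://duckduckgo.com/?q=" + encodeURIComponent(urlIn) + "&format=json";
    request(srcUrl, function (error, response, body) {
        // Report failed requests instead of silently dropping them.
        if (error || response.statusCode !== 200) {
            return callback(urlIn, undefined);
        }
        var urlOut;
        try {
            // Parsing happens inside the try block: the body may not be
            // valid JSON, and Results may be empty.
            urlOut = JSON.parse(body).Results[0].FirstURL;
        } catch (err) {
            urlOut = undefined;
        }
        callback(urlIn, urlOut);
    });
};
var callback = function (input, output) {
    console.log(input + " >> " + output);
};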
lookupFullyQualifiedURL("facebook", callback);
lookupFullyQualifiedURL("facebook.com", callback);
lookupFullyQualifiedURL("github", callback);
lookupFullyQualifiedURL("trello.com", callback);
lookupFullyQualifiedURL("whatsapp", callback);
lookupFullyQualifiedURL("web.whatsapp.com", callback);
lookupFullyQualifiedURL("whatsapp", callback);
lookupFullyQualifiedURL("spotify", callback);
The output is something like this:
web.whatsapp.com >> undefined
whatsapp >> https://www.whatsapp.com/
whatsapp >> https://www.whatsapp.com/
trello.com >> https://trello.com
github >> https://github.com/
spotify >> https://www.spotify.com
facebook >> https://www.facebook.com/
facebook.com >> https://www.facebook.com/
There is still room for improvement: for web.whatsapp.com, for example, nothing is returned, because of limitations of the DuckDuckGo API.
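One possible way to cover inputs such as web.whatsapp.com is to fall back to the input itself whenever it already looks like a host name. The following is only a sketch under assumptions: guessUrlFromDomain and chooseUrl are hypothetical helpers that are not part of the code above, and the guess simply assumes HTTPS, which a given site may not actually serve.

```javascript
// Hypothetical helper, not part of the original answer.
// Returns a URL guess for inputs that already look like a host name,
// or undefined for plain search terms such as "whatsapp".
function guessUrlFromDomain(input) {
    if (typeof input !== "string" || !input.includes(".")) {
        return undefined;
    }
    try {
        // Assumption: HTTPS is guessed; the site may only serve HTTP.
        return new URL("https://" + input.trim()).href;
    } catch (err) {
        return undefined; // input cannot be parsed as a host name
    }
}

// Prefer the search result; use the guess only when no result was found.
function chooseUrl(input, searchResult) {
    return searchResult || guessUrlFromDomain(input);
}

console.log(chooseUrl("web.whatsapp.com", undefined));
console.log(chooseUrl("facebook.com", "https://www.facebook.com/"));
```

Inside the callback, chooseUrl(input, output) could then replace the bare output value. The guess is not verified against the network, so it may point to a URL that does not respond.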
Upvotes: 1
Reputation: 498
Node.js has a built-in module called dns that can resolve a host name even when it is given without a protocol or path. For example, dns.resolve4() resolves a host name into its IPv4 addresses:
const dns = require('dns');
dns.resolve4('nodejs.org', (err, addresses) => {
    if (err) throw err;
    console.log(`addresses: ${JSON.stringify(addresses)}`);
    addresses.forEach((a) => {
        dns.reverse(a, (err, hostnames) => {
            if (err) {
                throw err;
            }
            console.log(`reverse for ${a}: ${JSON.stringify(hostnames)}`);
        });
    });
});
There are also dns.lookup(hostname[, options], callback) and dns.resolve(hostname[, rrtype], callback). One of these should be a valid solution for you.
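Note that dns.lookup() and dns.resolve4() return IP addresses rather than a protocol or path, so a further step is needed to arrive at a full URL. A rough sketch, assuming that HTTPS should be preferred over HTTP and that a HEAD request is enough to tell whether a server answers, might look like this. The names candidateUrls, probe and resolveUrl are made up for illustration.

```javascript
const dns = require("dns");
const http = require("http");
const https = require("https");

// Hypothetical helper: candidate URLs for a bare host name, HTTPS first.
function candidateUrls(hostname) {
    return ["https://" + hostname + "/", "http://" + hostname + "/"];
}

// Hypothetical helper: sends a HEAD request and reports the status code
// and any redirect target, or an error if the request fails.
function probe(url, callback) {
    const client = url.startsWith("https:") ? https : http;
    let done = false;
    const finish = (err, result) => {
        if (done) return; // make sure the callback fires only once
        done = true;
        callback(err, result);
    };
    const req = client.request(url, { method: "HEAD", timeout: 5000 }, (res) => {
        res.resume(); // discard any body so the socket is released
        finish(null, { url: url, status: res.statusCode, location: res.headers.location });
    });
    req.on("timeout", () => req.destroy(new Error("timeout")));
    req.on("error", (err) => finish(err));
    req.end();
}

// Checks that the host resolves, then tries each candidate in order.
function resolveUrl(hostname, callback) {
    dns.lookup(hostname, (err) => {
        if (err) return callback(err);
        const urls = candidateUrls(hostname);
        (function tryNext(i) {
            if (i >= urls.length) return callback(new Error("no candidate answered"));
            probe(urls[i], (probeErr, result) => (probeErr ? tryNext(i + 1) : callback(null, result)));
        })(0);
    });
}

// Example call (needs network access):
// resolveUrl("electron.atom.io", (err, result) => console.log(err || result));
```

The location field, when present, holds the redirect target reported by the server (for example a www host), which the caller may choose to follow. The sketch does not follow redirects itself.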
The documentation I am referring to is that of the Node.js dns module.
Upvotes: 1