user2095363

Find out first-guess fully-qualified URL for domain name using Node/JavaScript

For input domains like

web.whatsapp.com
facebook.com
electron.atom.io

I want to find the fully qualified URL that would come up when entering the domain in the Chrome address bar or a Google search. The output would be

https://www.facebook.com
https://web.whatsapp.com/
http://electron.atom.io/

The solution should determine at least the protocol and, as in the Facebook example above, the preferred host name (for instance with the www prefix). I tried the Google Custom Search API (not free) and Node's basic http/https modules, which don't accept a bare domain name.

Any help is appreciated!

Upvotes: 0

Views: 249

Answers (2)

user2095363

Okay, until something better comes along, I'll work with the free DuckDuckGo API and use the URL from the first search result.

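var request = require("request"); // npm package "request" (third-party, not built in)

var lookupFullyQualifiedURL = function ( urlIn, callback ) {
    if (typeof callback !== "function") {
        callback = function() {};
    }
    if (urlIn === undefined) {
        // Nothing to look up; report the miss through the callback.
        return callback(urlIn, undefined);
    }
    // Encode the query so special characters cannot break the request URL.
    const srcUrl = "https://duckduckgo.com/?q=" + encodeURIComponent(urlIn) + "&format=json";
    request(srcUrl, function (error, response, body) {
        if (error || response.statusCode !== 200) {
            // Always answer the caller, even when the request fails.
            return callback(urlIn, undefined);
        }
        var urlOut;
        try {
            // Parsing happens inside the try so a non-JSON body cannot throw.
            urlOut = JSON.parse(body).Results[0].FirstURL;
        } catch (err) {
            urlOut = undefined; // no first result, or unparsable response
        }
        callback(urlIn, urlOut);
    });
};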

var callback = function ( input, output ) {
    console.log(input + " >> " + output);
}

lookupFullyQualifiedURL("facebook", callback);
lookupFullyQualifiedURL("facebook.com", callback);
lookupFullyQualifiedURL("github", callback);
lookupFullyQualifiedURL("trello.com", callback);
lookupFullyQualifiedURL("whatsapp", callback);
lookupFullyQualifiedURL("web.whatsapp.com", callback);
lookupFullyQualifiedURL("whatsapp", callback);
lookupFullyQualifiedURL("spotify", callback);

The output is something like this (the order varies because the requests complete asynchronously):

web.whatsapp.com >> undefined
whatsapp >> https://www.whatsapp.com/
whatsapp >> https://www.whatsapp.com/
trello.com >> https://trello.com
github >> https://github.com/
spotify >> https://www.spotify.com
facebook >> https://www.facebook.com/
facebook.com >> https://www.facebook.com/

There is still room for improvement. For example, nothing is returned for web.whatsapp.com, because the DuckDuckGo API does not provide a result entry for every query.
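
One possible fallback for inputs where the API returns nothing is to probe the domain directly: try https:// first, then http://, and take the first URL that answers. This is only a minimal sketch, assuming Node's built-in http and https modules. The names candidateUrls, probe and probeFirstReachable are hypothetical helpers, the 5 second timeout is an arbitrary choice, and a redirect (such as facebook.com to www.facebook.com) is only read from the Location header of the first response, not followed further:

```javascript
const http = require("http");
const https = require("https");

// Build the URLs to try for a bare domain, preferring HTTPS.
function candidateUrls(domain) {
  return ["https://", "http://"].map((scheme) => new URL(scheme + domain).href);
}

// Send a HEAD request. Resolves with the redirect target if the server
// redirects, with the probed URL if it answers otherwise, or with null
// if the host cannot be reached.
function probe(url) {
  return new Promise((resolve) => {
    const client = url.startsWith("https:") ? https : http;
    const req = client.request(url, { method: "HEAD", timeout: 5000 }, (res) => {
      res.resume(); // discard any body so the socket is released
      const location = res.headers.location;
      const isRedirect = res.statusCode >= 300 && res.statusCode < 400;
      resolve(isRedirect && location ? new URL(location, url).href : url);
    });
    req.on("timeout", () => req.destroy()); // surfaces as an "error" event
    req.on("error", () => resolve(null));
    req.end();
  });
}

// Try each candidate in order and return the first one that answers.
async function probeFirstReachable(domain) {
  for (const url of candidateUrls(domain)) {
    const result = await probe(url);
    if (result) return result;
  }
  return undefined;
}
```

It could be called as probeFirstReachable("web.whatsapp.com").then(console.log). Unlike the search-based lookup, this only works for inputs that are already valid host names, so "whatsapp" without a TLD would still need the API.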

Upvotes: 1

Paul

Reputation: 498

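Node.js has a built-in module called dns that can resolve host names, including bare domain names given without a protocol:

For example, dns.resolve4() resolves a host name to its IPv4 addresses, and dns.reverse() maps each address back to host names: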

const dns = require('dns');

dns.resolve4('nodejs.org', (err, addresses) => {
  if (err) throw err;

  console.log(`addresses: ${JSON.stringify(addresses)}`);

  addresses.forEach((a) => {
    dns.reverse(a, (err, hostnames) => {
      if (err) {
        throw err;
      }
      console.log(`reverse for ${a}: ${JSON.stringify(hostnames)}`);
    });
  });
});

There are also dns.lookup(hostname[, options], callback) and dns.resolve(hostname[, rrtype], callback).

One of these should be a valid solution for you.

The documentation I am referring to:

DNS Documentation in NodeJS
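
The dns functions expect a plain host name rather than a URL, so half-formatted input may need to be reduced to its host name first. A minimal sketch of that step, assuming a Node version that provides the global URL class and dns.promises; extractHostname and lookupHost are hypothetical helper names, not part of the dns module. Note that a successful lookup only confirms that the host exists; it does not say whether the site is served over http or https:

```javascript
const dns = require("dns");

// Reduce a half-formatted input (with or without scheme, path, or port)
// to the bare host name that the dns functions expect.
function extractHostname(input) {
  const trimmed = String(input).trim();
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  // A placeholder scheme lets the URL parser handle scheme-less input.
  return new URL(hasScheme ? trimmed : "http://" + trimmed).hostname;
}

// Resolve the host name through the operating system's resolver.
// The returned promise rejects if the name does not resolve.
async function lookupHost(input) {
  const hostname = extractHostname(input);
  const { address, family } = await dns.promises.lookup(hostname);
  return { hostname, address, family };
}
```

For example, lookupHost("http://electron.atom.io/docs").then(console.log) would look up electron.atom.io and print the host name together with the address and address family that the resolver returns.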

Upvotes: 1
