Reputation: 2145
I have a list of URLs (all including http://), where some are bare domain names and others include a full path.
How could I programmatically extract the extension (.com, .net, ...) using shell scripting, taking into consideration that some extensions are multi-part, such as .co.uk?
Upvotes: 0
Views: 647
Reputation: 16907
Essentially you'd need a list of everything you're considering a "TLD"; there are a finite number of these. Then for each URL, check whether anything in your list matches it, and if so, print the match. The reason you need to construct the list yourself is that .co.uk is not a TLD: .uk is the TLD and .co is a subdomain of it. A sketch of this approach is below.
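For example, a minimal sketch in Python (the suffix list here is a hand-picked stand-in for illustration; a real script would seed it from something like the Public Suffix List):

from urllib.parse import urlparse

# Hand-picked suffix list (an assumption for illustration), sorted
# longest-first so that '.co.uk' is tried before '.uk'.
SUFFIXES = sorted(['.com', '.net', '.ca', '.biz', '.co.uk', '.uk'],
                  key=len, reverse=True)

def extension(url):
    # netloc is the hostname part, e.g. www.mydomain.co.uk
    host = urlparse(url).netloc
    for suffix in SUFFIXES:
        if host.endswith(suffix):
            return suffix
    return None  # host uses a suffix not in our list

print(extension('http://www.mydomain.co.uk/path/to/file.html'))  # prints: .co.uk
print(extension('http://example.com'))                           # prints: .com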
Or you could construct an enormously long regex (for example, extracting .co.uk, .com, .ca, .biz):
$ perl -ne 'next unless m{^http://[^ /?#]+(\.com|\.co\.uk|\.ca|\.biz)(?=[/?#:]|$)}; print $1, "\n"'
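For instance, given a file (call it urls.txt, a hypothetical name) containing http://www.mydomain.co.uk/path/to/file.html and http://example.com, one URL per line, the one-liner prints:

.co.uk
.com

The lookahead anchors the match at the end of the hostname, and anything whose extension is not in the alternation is silently skipped, so the list of extensions has to be maintained by hand.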
Upvotes: 2
Reputation: 40733
The most robust way is to use a library to parse the URL. For example, in Python:
from urllib.parse import urlparse  # the module was named urlparse in Python 2

# netloc is the network location, e.g. www.mydomain.co.uk
domain = urlparse('http://www.mydomain.co.uk/path/to/file.html').netloc
# the last dot-separated label is the TLD proper (.uk, not .co.uk)
tld = domain.split('.')[-1]
print(tld)  # prints: uk
This prints just the last label of the network location, which is the TLD proper (uk in this example, or what I think you meant by "extension").
UPDATE: the code now prints the TLD, instead of the whole domain.
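If you actually want the full public suffix (.co.uk rather than .uk), one option is the third-party tldextract library, which matches hostnames against the Public Suffix List. A minimal sketch, assuming tldextract is installed:

import tldextract

# extract() returns a named tuple with subdomain, domain and suffix fields
ext = tldextract.extract('http://www.mydomain.co.uk/path/to/file.html')
print(ext.suffix)  # prints: co.uk
print(ext.domain)  # prints: mydomain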
Upvotes: 2