Paul Wratt

Reputation: 161

How to find what a specific TLD is?

I am looking for a way to find out what the ccTLD for a given country is.

I have a way to check whether a TLD or ccTLD is real, but the public source it uses is a plain list: it does not say which country each entry belongs to, and it has no descriptions.

For example, I was looking for the ccTLD for Estonia. I already knew that .es is España (Spain), and it turns out that .et is Ethiopia. Luckily, the check mentioned above can do a "starts with" match, which gives me a list:

$ find-tld.sh e
EC
EE
EG
ER
ES
ET
EU

Upvotes: 0

Views: 144

Answers (1)

Paul Wratt

Reputation: 161

There may be a list somewhere on the internet that maps each ccTLD to its country, or you could make that list yourself (there are a lot of them, but the example below could help you build it).

In the meantime, if you have a reduced set of verified ccTLDs that you want to check (like the list in the question), then this will work:

TLD=au
curl -r 0-1024 -s "https://en.wikipedia.org/wiki/.$TLD" | \
grep -m1 "<b>\.$TLD</b>" | \
sed 's/<[^>]*>//g' | sed 's/&#91;[^&]*&#93;//g'

I adapted the above into a script so that it does not output a lot of (CSS) garbage when you look up an invalid ccTLD.

You might be able to pull the "for country" part out of the results, but that's quite complex once you compare the results for .ac against the results for the list above (some of which are ultra simple).
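
That script is not shown here. As a rough sketch only, a wrapper along these lines might do the job. The function names is_cctld_label and lookup_tld are made up for illustration, and the two-letter check is an assumption about what counts as a plausible ccTLD, not a real validity test.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner; names are illustrative only.

# Succeeds only for a plain two-letter label such as "au" or "EE".
is_cctld_label() {
    case "$1" in
        [A-Za-z][A-Za-z]) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the first bold ".tld" line of the Wikipedia article as plain text.
# Returns 2 for a malformed label and 1 when no matching line was found.
lookup_tld() {
    tld=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    is_cctld_label "$tld" || { echo "not a two-letter code: $1" >&2; return 2; }
    line=$(curl -r 0-1024 -s "https://en.wikipedia.org/wiki/.$tld" |
        grep -m1 "<b>\.$tld</b>" |
        sed 's/<[^>]*>//g' | sed 's/&#91;[^&]*&#93;//g')
    [ -n "$line" ] || { echo "no description found for .$tld" >&2; return 1; }
    printf '%s\n' "$line"
}
```

Calling lookup_tld ee would then print the description for .ee, while malformed input is rejected before any request is made.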


Explanation:

TLD= sets the top-level domain or country code you want information on; the results are taken from Wikipedia.

-r 0-1024 makes curl collect only about the first 1 KB of HTML. When the TLD is a false entry, the first line grep could match always starts beyond this range (at byte 1063, to be precise), while for a true entry it always falls between bytes 600 and 850. So only the line that has the TLD name in bold is returned, after which grep stops processing input (-m1).

From that first line returned by grep, the first sed strips out the HTML tags, while the second sed removes the HTML entities (&#91; and &#93;, i.e. square brackets) used to create citation references on the Wikipedia web site, e.g. [2].

The result is a paragraph of human-readable text about the top-level domain you are looking up. It is a reference for a person to read, not machine-readable text from which the country being referenced could easily be extracted.
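
To see the two sed steps in isolation, here is a small sketch that runs them on a made-up sample line (not an actual Wikipedia response). The function name strip_markup is illustrative, and the second pattern is written so that it also accepts citation markers longer than one character.

```shell
# Strip HTML tags, then citation markers such as &#91;2&#93; (an encoded "[2]").
strip_markup() {
    sed 's/<[^>]*>//g' | sed 's/&#91;[^&]*&#93;//g'
}

# Made-up sample line, shaped like the line grep is expected to return.
sample='<p><b>.au</b> is the <a href="/wiki/CcTLD">ccTLD</a> for Australia.&#91;2&#93;</p>'
printf '%s\n' "$sample" | strip_markup
# prints: .au is the ccTLD for Australia.
```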
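
For the simple cases, a rough attempt at pulling the country out of that paragraph could look like the sketch below. It assumes the sentence contains "top-level domain" followed by "for" or "of" and then a capitalised name; the sample sentences are invented for illustration, and more complicated entries such as .ac may well not fit. The function name extract_country is hypothetical.

```shell
# Hypothetical helper: print the name that follows "for"/"of" after
# "top-level domain", up to the first period, comma or parenthesis.
# Prints nothing when the sentence does not have that shape.
extract_country() {
    sed -nE 's/.*top-level domain( \(ccTLD\))? (for|of) (the )?([A-Z][^.,(]*).*/\4/p'
}

# Invented sample sentences, not real lookup results.
printf '%s\n' '.au is the Internet country code top-level domain (ccTLD) for Australia.' | extract_country
printf '%s\n' '.ee is the country code top-level domain of Estonia, run by a registry.' | extract_country
```

With these two samples the helper prints Australia and Estonia; anything it cannot parse still needs to be read by hand.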

Upvotes: 0
