Andy K

Reputation: 5044

Doing a whois with a list of domain names

I've got a file containing roughly 2,500 domain names.

I would like to do a whois on these domain names.

Upvotes: 5

Views: 18079

Answers (6)

nitnit

Reputation: 31

You can do that with a simple "one liner" with the command xargs.

xargs -a valid_dns.txt -I {} sh -c 'echo "Domain: {}"; whois {}'

Upvotes: 2

Fahad Anyit

Reputation: 11

You have three options with WhoisFreaks for obtaining WHOIS data:

  1. WHOIS Lookup API lets you request WHOIS data for individual domains, one domain per request. The response can be obtained in JSON or XML format.

    https://api.whoisfreaks.com/v1.0/whois?apiKey=API_KEY&whois=live&domainName=whoisfreaks.com

  2. Bulk WHOIS Lookup API streamlines your WHOIS data retrieval by allowing you to query up to 100 domains in a single request. You can choose between JSON or XML format for the response.

    https://api.whoisfreaks.com/v1.0/bulkwhois?apiKey=API_KEY
    In the request body, add domain names in JSON format.

    { "domainNames":[ "google.com", "abc.com", "google.uk", "google.us" ] }

  3. Bulk Whois checker web interface offers a convenient notification feature via email. After you upload a plain text or CSV file containing one domain per line, it will notify you and provide output files in CSV format. This interface supports the lookup of a minimum of 101 domains and a maximum of 3 million domains.
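
Since the bulk endpoint takes at most 100 domains per request, a list of 2,500 names has to be split into batches first. Here is a minimal sketch of that step in Python. It only builds the JSON request bodies in the shape shown above and does not send anything; the function name batch_bodies and the sample domain names are made up for illustration, and the HTTP call itself is left out because it depends on the API documentation.

```python
import json


def batch_bodies(domains, batch_size=100):
    """Yield one JSON request body per batch of at most batch_size domains."""
    for start in range(0, len(domains), batch_size):
        chunk = domains[start:start + batch_size]
        # Same shape as the request body shown above
        yield json.dumps({"domainNames": chunk})


# 250 made-up domains give three batches: 100, 100 and 50 names
domains = ["example%d.com" % i for i in range(250)]
bodies = list(batch_bodies(domains))
print([len(json.loads(body)["domainNames"]) for body in bodies])
```

Each body would then be submitted as a separate request to the bulk endpoint.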

Upvotes: 1

egemen

Reputation: 19

Download and install Microsoft's whois tool from http://technet.microsoft.com/en-us/sysinternals/bb897435.aspx

Create a text file with the list of domain names, with a header row.

name
google.com
yahoo.com
stackoverflow.com

Create a powershell script:

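$domainname = Import-Csv -Path "C:\domains.txt"
foreach ($domain in $domainname)
{
   # whois.exe prints plain text, so append it to a text file
   # (Export-Csv would only record the length of each output line)
   .\whois.exe $domain.name | Out-File -FilePath "C:\domain-info.txt" -Append
}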

Run the powershell script.

Upvotes: 1

Alex Riley

Reputation: 176730

It looks like you've had some helpful answers already, but I thought it might be good to say a little more about the challenges of doing WHOIS lookups in bulk (and in general) and provide some alternative solutions.

The WHOIS lookup

Looking up a single domain name typically involves finding the relevant WHOIS server for that domain and then requesting the information via port 43. If you have access to a unix-like shell (e.g. Bash), you can use whois to do this easily (as noted by others):

$ whois example.com
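
To make the port 43 exchange concrete, here is a rough Python sketch of what the whois tool does once it knows which server to ask: open a TCP connection, send the domain name followed by CRLF, and read until the server closes the connection. The function names are my own, and the server in the usage comment (whois.verisign-grs.com, the .com registry's server) is just one example; finding the right server for each TLD is not covered here.

```python
import socket


def build_query(domain):
    """Return the bytes sent to a WHOIS server: the name followed by CRLF."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois_query(domain, server, port=43, timeout=10):
    """Send one query to the given WHOIS server and return the raw reply text."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_query(domain))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Usage (needs network access):
# print(whois_query("example.com", "whois.verisign-grs.com"))
```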

Very similar WHOIS tools have also been made available as modules for a vast array of programming languages. The pywhois module for Python is one example.

In its simplest form, a bulk WHOIS lookup is just looping over a list of domains, issuing a whois request for each domain and writing the record to an output.

Here is an example in Bash that reads domains from a file domains.txt and writes each WHOIS record into separate files (if you're using Windows, give Cygwin a try).

#!/bin/bash

domain_list="domains.txt"

while IFS= read -r line
do
    echo "Looking up ${line}..."
    whois "$line" > "${line}.txt"
    sleep 1
done < "$domain_list"

Beware of the following complications of WHOIS lookups in bulk:

  • Some WHOIS servers may not give you a full WHOIS record. This is especially true for country-specific domains (such as .de and .fr) and domains registered with certain registrars (such as GoDaddy).

    If you want the fullest possible record, you'll often have to go to the registry's website or to a third-party service which may have cached the record (e.g. DomainTools). This is much more difficult to automate and may have to be done manually. Even then, the record may not contain what you want (e.g. contact details for the registrant).

  • Some WHOIS servers impose restrictions on the number of requests you can make in a certain time frame. If you hit the limit, you might find that you have to return a few hours later to request the records again. For example, with .org domains, you are limited to no more than three lookups in a minute and a few registrars will bar you for 24 hours.

    It's best to pause for a few seconds between lookups, or try to shuffle your list of domains by TLD so you don't bother the same server too many times in quick succession.

  • Some WHOIS servers are frequently down and the request will time out, meaning that you might need to go back and re-do these lookups. ICANN stipulates that whois servers must have a pretty decent uptime, but I've found one or two servers that are terrible at giving out records.
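
As a rough illustration of the last two points, the Python sketch below reorders a list so that consecutive lookups go to different TLDs, and wraps a lookup in a simple retry loop for servers that time out. The names interleave_by_tld and lookup_with_retry are made up, the lookup argument stands for whatever function you use to fetch a record, and the default pause and attempt counts are guesses rather than limits published by any registry.

```python
import time
from collections import defaultdict
from itertools import zip_longest


def interleave_by_tld(domains):
    """Reorder domains round-robin by TLD so the same server is not hit twice in a row."""
    groups = defaultdict(list)
    for domain in domains:
        tld = domain.rsplit(".", 1)[-1].lower()
        groups[tld].append(domain)
    ordered = []
    # Take one domain from each TLD group in turn until all groups are empty
    for row in zip_longest(*groups.values()):
        ordered.extend(d for d in row if d is not None)
    return ordered


def lookup_with_retry(lookup, domain, attempts=3, pause=5.0):
    """Call lookup(domain), waiting a little longer after each failed attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return lookup(domain)
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(pause * attempt)


print(interleave_by_tld(["a.com", "b.com", "c.org", "d.de"]))
```

Spreading requests this way does not remove the rate limits; it only makes it less likely that you run into them early.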

Parsing the record

Parsing WHOIS records (e.g. for registrant contact information) can be a challenge because:

  • The records are not always in a consistent format. You'll find this with the .com domains in particular. A .com record might be held by any one of thousands of registrars worldwide (not by the .com registry, Verisign) and not all choose to present the records in an easy-to-parse format recommended by ICANN.

  • Again, the information you want to extract might not be in the record you get back from the lookup.
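
If you would rather parse the raw text yourself, one simple approach is to try several possible labels for the same field and return the first match, or nothing at all. Here is a minimal Python sketch of that idea. The label names and the sample record are assumptions for illustration; real records use many more variants, so treat this as a starting point only.

```python
import re

# Assumed label variants; extend this after looking at the records you actually get
REGISTRANT_LABELS = ("Registrant Name", "Registrant Organization", "Registrant")


def extract_field(record_text, labels):
    """Return the value of the first 'Label: value' line found, or None."""
    for label in labels:
        pattern = r"^[ \t]*%s[ \t]*:[ \t]*(\S.*)$" % re.escape(label)
        match = re.search(pattern, record_text, re.IGNORECASE | re.MULTILINE)
        if match:
            return match.group(1).strip()
    return None


# Made-up sample record
sample = (
    "Domain Name: EXAMPLE.COM\n"
    "   Registrar: Example Registrar, Inc.\n"
    "Registrant Name: Jane Doe\n"
)
print(extract_field(sample, REGISTRANT_LABELS))
print(extract_field(sample, ["Admin Email"]))
```

Returning None for a missing field keeps "not in the record" separate from a failed lookup.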

Since it's been mentioned already, pywhois is one option to parse WHOIS data. Here's a very simple Python script which looks up the WHOIS record for each domain and extracts the registrant name (where possible*), writing the results to a CSV file. You can include other fields too if you like:

import whois
import csv

# Read the domains, skipping blank lines
with open("domains.txt", "r") as f:
    domains = [line.strip() for line in f if line.strip()]

# Python 3: open in text mode; newline="" avoids blank rows on Windows
with open("output_file.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["Domain", "Registrant Name"])
    for domain in domains:
        record = whois.whois(domain)
        try:
            r_name = record.registrant_name
        except AttributeError:
            r_name = "error"
        writer.writerow([domain, r_name])

* When I tested this script quickly, pywhois wasn't very reliable in extracting the registrant name. Another similar library you could try instead is pythonwhois.

Upvotes: 4

Martin Ogden

Reputation: 882

Assuming the domains are in a file named domains.txt and you have pywhois installed, then something like this should do the trick:

import whois

infile = "domains.txt"

# get domains from file
with open(infile) as f:
    domains = [line.strip() for line in f if line.strip()]

for domain in domains:
    print(domain)
    record = whois.whois(domain)

    # write each whois record to a file {domain}.txt
    with open("%s.txt" % domain, 'w') as f:
        f.write(record.text)

This will output each whois record to a file named {domain}.txt


Without pywhois:

import subprocess

infile = "domains.txt"

# get domains from file
with open(infile) as f:
    domains = [line.strip() for line in f if line.strip()]

for domain in domains:
    print(domain)
    # check_output returns the raw bytes printed by the whois command
    record = subprocess.check_output(["whois", domain])

    # write each whois record to a file {domain}.txt
    with open("%s.txt" % domain, 'wb') as f:
        f.write(record)

Upvotes: 3

Alu

Reputation: 737

You can also use the Linux command-line tool whois. The following code opens a subprocess and looks up the domain.

But you have to be careful about making many requests in a short time. The servers will eventually block you. ;)

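import subprocess

whois_timeout = 30  # seconds to wait for each lookup


class WhoisError(Exception):
    """Raised when the whois command fails or times out."""


def find_whois(domain):
    # Linux 'whois' command wrapper
    #
    # Executes a whois lookup with the linux command whois.
    # Returncodes from: https://github.com/rfc1036/whois/blob/master/whois.c

    domain = domain.lower().strip()
    if domain.startswith('www.'):
        domain = domain[4:]

    # Run command with timeout
    proc = subprocess.Popen(['whois', domain], stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        ans, err = proc.communicate(timeout=whois_timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()
        raise WhoisError('Whois has timed out after ' + str(whois_timeout) + ' seconds. (try again later or try higher timeout)')

    # The exit status is in proc.returncode; err only holds the stderr text
    if proc.returncode == 1: raise WhoisError('No Whois Server for this TLD or wrong query syntax')
    elif proc.returncode == 2: raise WhoisError('Whois has timed out. (try again later)')
    return ans.decode('UTF-8', errors='replace')


with open('domains.txt') as infile:
    with open('out.txt', 'a') as output:
        for line in infile:
            if line.strip():  # skip blank lines
                output.write(find_whois(line))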

The with open(...) as statement closes each file automatically when the block ends. The 'a' passed for the output file opens it in append mode.

Upvotes: 9
