What would be an optimal way to perform DNS query from bash in python3?

Question

I have a simple bash script that I'm thinking about incorporating into my Python project, because I couldn't figure out a graceful way to do in python3 what this single bash one-liner does. Is there a better way to do this in python3, or a library that would help with storing all legitimate unique hostnames in a list or dictionary?

I've tried doing something along the lines of,

test = []
try:
    with open("dig.log", "r") as d:
        for line in d:
            parsed_lines = line.rstrip()
            if not parsed_lines.startswith(";"):
                test.append(parsed_lines.split())
except FileNotFoundError as fnf_error:
    print(fnf_error)

which outputs,


10.10.10.10.in-addr.arpa. 604800 IN     PTR     ns1.yowhat.sup

10.10.10.in-addr.arpa.  604800  IN      NS      ns1.yowhat.sup

ns1.yowhat.sup.         604800  IN      A       10.10.10.10

with a bunch of blank lines. I couldn't figure out how to gracefully strip the blank lines and return only the unique hostnames in Python. I can get exactly that behaviour with a single bash one-liner:

grep -v ";" $dig_file | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR	//p; s/^.*NS	//p; s/^.*MX	//p; s/^.*CNAME	//p; s/^.*TXT	//p' | sort -u >$output_file_name

Which will output,

ns1.yowhat.sup

to a file. The helper bash script that I'm using in my Python program is,

#!/usr/bin/env bash

dig_file=$1
output_file_name=$2
NICE='\e[1;32;92m[+]\e[0m'

parse_dig() {
    echo -e "${NICE} parsing dig queries to find hostnames ya dig?"
    grep -v ";" $dig_file | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR	//p; s/^.*NS	//p; s/^.*MX	//p; s/^.*CNAME	//p; s/^.*TXT	//p' | sort -u >$output_file_name
}
parse_dig

which I would then call from my Python project with something like,

subprocess.call("./parse_dig dig.log host_names.log", shell=True)

How can I do what my simple bash helper script does in python3, so that I don't need a bunch of bash scripts to parse output from files? Would it make more sense to not use

subprocess.call("dig command | tee dig.log" , shell=True)

and do something like,

dig_output = subprocess.check_output("dig command...", shell=True, stderr=subprocess.STDOUT)

and then somehow parse the dig_output in python or what would be the most elegant, pythonic, ideal way to do this in python3?

lenik · Accepted Answer

You'll need this to run dig from python and catch the output:

from subprocess import PIPE, Popen

def cmdline(command):
    process = Popen(
        args=command,
        stdout=PIPE,
        shell=True
    )
    return process.communicate()[0]
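
A possible variant, not part of the original answer: on Python 3.5+ the same thing can be sketched with subprocess.run, passing the command as a list so that no shell is involved and asking for decoded text instead of bytes. The name run_command is hypothetical and only used for illustration.

```python
import subprocess


def run_command(args):
    """Run a command given as a list of arguments and return its output as str.

    Hypothetical helper for illustration; stderr is merged into stdout.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode bytes to str (also spelled text=True on 3.7+)
        check=True,               # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Assuming dig is installed and on the PATH, it would be called as run_command(["dig", "google.com", "ns"]).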

After that things get quite easy:

>>> dig_output = [i.strip() for i in cmdline('dig google.com ns').decode().split('\n')]
>>> dig_filtered = [i.split() for i in dig_output if len(i) > 10]
>>> domains = [i[-1] for i in dig_filtered if i[-2] in ['PTR', 'MX', 'NS', 'CNAME', 'TXT']]
>>> domains
['ns1.google.com.', 'ns2.google.com.', 'ns4.google.com.', 'ns3.google.com.']
>>>
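
To get closer to what the bash one-liner in the question does (skip comments and blank lines, drop the trailing dot, keep unique sorted names), here is a minimal sketch that works on dig output already held in a string. It assumes the usual "name ttl class type rdata" column layout; parse_hostnames and RECORD_TYPES are hypothetical names. Unlike grep -v ";", it only skips lines that start with ";", and for TXT records it keeps only the last whitespace-separated token.

```python
# Record types whose data field is treated as a hostname (taken from the question).
RECORD_TYPES = {"PTR", "NS", "MX", "CNAME", "TXT"}


def parse_hostnames(dig_text, record_types=RECORD_TYPES):
    """Return a sorted list of unique names found in dig output.

    Assumes each answer line looks like: name ttl class type rdata...
    """
    hostnames = set()
    for line in dig_text.splitlines():
        line = line.strip()
        # Skip blank lines and dig's ";" comment lines.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        # The record type is the fourth column; ignore anything else.
        if len(fields) < 5 or fields[3] not in record_types:
            continue
        # The last column is the target name; strip the trailing root dot.
        hostnames.add(fields[-1].rstrip("."))
    return sorted(hostnames)
```

For a saved log such as dig.log, the file contents can be read with open("dig.log").read() and passed in directly. Taking the fourth column as the record type also covers MX lines, where a preference number sits between the type and the host name.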
