Reputation: 15
I have a simple Bash script that I'm thinking about incorporating into my Python project, because I couldn't figure out a graceful way to do in Python 3 what this single Bash one-liner does. Is there a better way to do this in Python 3, or a library that would help with storing all legitimate unique hostnames in a list or dictionary?
I've tried doing something along the lines of,
test = []
try:
    with open("dig.log", "r") as d:
        for line in d:
            parsed_lines = line.rstrip()
            if not parsed_lines.startswith(";"):
                test.append(parsed_lines.split())
except FileNotFoundError as fnf_error:
    print(fnf_error)
which outputs,
10.10.10.10.in-addr.arpa. 604800 IN PTR ns1.yowhat.sup
10.10.10.in-addr.arpa. 604800 IN NS ns1.yowhat.sup
ns1.yowhat.sup. 604800 IN A 10.10.10.10
with a bunch of blank lines. I couldn't figure out how to gracefully strip out all the blank lines and return only the unique hostnames in Python. I can get exactly the functionality I want with a single Bash one-liner:
grep -v ";" $dig_file | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR\t//p; s/^.*NS\t//p; s/^.*MX\t//p; s/^.*CNAME\t//p; s/^.*TXT\t//p' | sort -u >$output_file_name
Which will output,
ns1.yowhat.sup
to a file. The helper Bash script that I'm using in my Python program is,
#!/usr/bin/env bash
dig_file=$1
output_file_name=$2
NICE='\e[1;32;92m[+]\e[0m'
parse_dig() {
    echo -e "${NICE} parsing dig queries to find hostnames ya dig?"
    grep -v ";" "$dig_file" | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR\t//p; s/^.*NS\t//p; s/^.*MX\t//p; s/^.*CNAME\t//p; s/^.*TXT\t//p' | sort -u >"$output_file_name"
}
parse_dig
which I would then call in my Python project with something like,
subprocess.call("./parse_dig dig.log host_names.log", shell=True)
How can I do what my simple Bash helper script does in Python 3, so that I don't need a bunch of Bash scripts to parse output from files? Would it make more sense to not use
subprocess.call("dig command | tee dig.log" , shell=True)
and do something like,
dig_output = subprocess.check_output("dig command...", shell=True, stderr=subprocess.STDOUT)
and then somehow parse dig_output in Python? What would be the most elegant, Pythonic way to do this in Python 3?
Upvotes: 1
Views: 246
Reputation: 23508
You'll need this to run dig from Python and capture the output:
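from subprocess import PIPE, Popen

def cmdline(command):
    # Run the command through the shell and capture its standard output.
    process = Popen(
        args=command,
        stdout=PIPE,
        shell=True,
        universal_newlines=True,  # return str rather than bytes on Python 3
    )
    return process.communicate()[0]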
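If you would rather not go through the shell, one possible variant is sketched below. It passes the command as a list of arguments to subprocess.run, folds stderr into stdout (as the question attempts with stderr=subprocess.STDOUT), and raises an exception on a non-zero exit status. The name run_command is made up for this illustration, and the self-check at the end calls the Python interpreter rather than dig so that it runs anywhere.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments and return its output as str.

    Hypothetical helper for illustration; no shell is involved, so the
    arguments are passed to the program exactly as listed.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge error output into the captured text
        universal_newlines=True,    # decode bytes to str
        check=True,                 # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


# Self-check that does not need dig: ask the current interpreter to print a word.
greeting = run_command([sys.executable, "-c", "print('hello')"])
```

With dig installed and on the PATH, the call would look like run_command(["dig", "google.com", "ns"]), and the result can be split into lines exactly as in the next step.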
After that things get quite easy:
>>> dig_output = [i.strip() for i in cmdline( 'dig google.com ns' ).split('\n')]
>>> dig_filtered = [i.split() for i in dig_output if len(i) > 10]
>>> domains = [i[-1] for i in dig_filtered if i[-2] in ['PTR', 'MX', 'NS', 'CNAME', 'TXT']]
>>> domains
['ns1.google.com.', 'ns2.google.com.', 'ns4.google.com.', 'ns3.google.com.']
>>>
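To get closer to what the Bash helper in the question does (skip comment and blank lines, drop the trailing dot, keep only unique names, sorted), something along these lines might work. It is a rough sketch, not a full dig parser: the names extract_hostnames, parse_dig_file, and RECORD_TYPES are invented for this example, and it assumes each answer line has the usual "name TTL class type data" layout. Unlike the sed version, it keeps only the last whitespace-separated field, so an MX priority is dropped and a TXT value containing spaces is truncated to its last word.

```python
# Record types whose data field is treated as a hostname (taken from the question).
RECORD_TYPES = {"PTR", "NS", "MX", "CNAME", "TXT"}


def extract_hostnames(dig_text, record_types=RECORD_TYPES):
    """Return the sorted, unique hostnames found in dig output text."""
    hostnames = set()
    for line in dig_text.splitlines():
        line = line.strip()
        # Skip blank lines and dig's ";"-prefixed comment lines.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        # Assumed layout: name, TTL, class, type, data...
        if len(fields) < 5 or fields[3] not in record_types:
            continue
        # Keep the last field and drop the trailing root dot.
        hostnames.add(fields[-1].rstrip("."))
    return sorted(hostnames)


def parse_dig_file(dig_path, output_path):
    """Read a saved dig log and write one hostname per line to output_path."""
    with open(dig_path, "r") as dig_file:
        hostnames = extract_hostnames(dig_file.read())
    with open(output_path, "w") as output_file:
        output_file.write("".join(name + "\n" for name in hostnames))
    return hostnames


sample = """
;; ANSWER SECTION:
10.10.10.10.in-addr.arpa. 604800 IN PTR ns1.yowhat.sup.
10.10.10.in-addr.arpa. 604800 IN NS ns1.yowhat.sup.
ns1.yowhat.sup. 604800 IN A 10.10.10.10
"""
print(extract_hostnames(sample))  # prints ['ns1.yowhat.sup']
```

Calling parse_dig_file("dig.log", "host_names.log") would then stand in for the subprocess call to the Bash script, and extract_hostnames can be fed the captured dig output directly if you skip the log file altogether.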
Upvotes: 4