Reputation: 27
I'm writing my own directory buster in python, and I'm testing it against a web server of mine in a safe and secure environment. This script basically tries to retrieve common directories from a given website and, looking at the HTTP status code of the response, it is able to determine if a page is accessible or not.
As a start, the script reads a file containing all the interesting directories to be looked up, and then requests are made, in the following way:
import fileinput
import httplib

for dir in fileinput.input('utils/Directories_Common.wordlist'):
    try:
        conn = httplib.HTTPConnection(url)
        conn.request("GET", "/" + str(dir))
        toturl = 'http://' + url + '/' + str(dir)[:-1]
        print ' Trying to get: ' + toturl
        r1 = conn.getresponse()
        response = r1.read()
        print ' ', r1.status, r1.reason
        conn.close()
    except Exception, e:
        print ' Error:', e
Then, the response is parsed and if a status code equal to "200" is returned, then the page is accessible. I've implemented all this in the following way:
if r1.status == 200:
    print '\n[!] Got it! The subdirectory '+str(dir)+' could be interesting..\n\n\n'
All seems fine to me, except that the script marks as accessible pages that actually aren't. That is, the script only collects pages that return a "200 OK", but when I manually browse to those pages I find they have been moved permanently or have restricted access. Something is going wrong, but I cannot spot exactly where I should fix the code; any help is appreciated.
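One detail worth double-checking in the snippet above (an observation about the code, not something confirmed by the answers): fileinput yields each line with its trailing newline, so the path passed to conn.request() contains a stray '\n', while the printed URL strips it with [:-1]. The request sent to the server and the URL checked by hand are therefore not the same. A minimal sketch of the difference:

    # Hypothetical wordlist entry, exactly as fileinput would yield it:
    # the trailing newline is kept.
    line = "admin\n"

    requested_path = "/" + line     # what conn.request() actually sends
    printed_path = "/" + line[:-1]  # what the script prints afterwards
    clean_path = "/" + line.strip() # stripping avoids the mismatch

    print(repr(requested_path))  # '/admin\n'
    print(repr(printed_path))    # '/admin'
    print(repr(clean_path))      # '/admin'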
Upvotes: 1
Views: 22073
Reputation: 75
I would advise you to use http://docs.python-requests.org/en/latest/# for HTTP.
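For reference, a minimal sketch with requests (assumes the library is installed; the probe helper is illustrative, not from the question). Note that requests follows redirects by default on GET, so passing allow_redirects=False keeps a 301/302 visible instead of silently following it to its target:

    import requests  # assumes `pip install requests`

    def probe(url):
        # HEAD request; allow_redirects=False so a 301/302 is reported
        # as-is rather than being resolved to the redirect target.
        response = requests.head(url, allow_redirects=False, timeout=5)
        return response.status_code

probe('http://example.com/admin') would then return the raw status code (404, 301, 403, ...), letting the caller decide which codes count as interesting.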
Upvotes: 1
Reputation: 8806
I did not find any problems with your code, except that it is almost unreadable. I have rewritten it into this working snippet:
import httplib

host = 'www.google.com'
directories = ['aosicdjqwe0cd9qwe0d9q2we', 'reader', 'news']

for directory in directories:
    conn = httplib.HTTPConnection(host)
    conn.request('HEAD', '/' + directory)
    url = 'http://{0}/{1}'.format(host, directory)
    print ' Trying: {0}'.format(url)
    response = conn.getresponse()
    print ' Got: ', response.status, response.reason
    conn.close()
    if response.status == 200:
        print ("[!] The subdirectory '{0}' "
               "could be interesting.").format(directory)
Outputs:
$ python snippet.py
Trying: http://www.google.com/aosicdjqwe0cd9qwe0d9q2we
Got: 404 Not Found
Trying: http://www.google.com/reader
Got: 302 Moved Temporarily
Trying: http://www.google.com/news
Got: 200 OK
[!] The subdirectory 'news' could be interesting.
Also, I used a HEAD HTTP request instead of GET, as it is more efficient when you do not need the contents and are interested only in the status code.
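Since the question's symptom is 200s that turn out to be redirects or restricted pages, it can also help to branch on the status family rather than on 200 alone. A small hypothetical helper (the function name and categories are illustrative, not part of the snippet above):

    def describe(status):
        # Map an HTTP status code to a rough category for reporting.
        if status == 200:
            return 'accessible'
        if 300 <= status < 400:
            return 'redirect'    # e.g. 301 Moved Permanently
        if status in (401, 403):
            return 'restricted'  # auth required / forbidden
        return 'other'

    print(describe(200))  # accessible
    print(describe(301))  # redirect
    print(describe(403))  # restricted

The loop above could then report "redirect" or "restricted" instead of silently skipping everything that is not a 200.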
Upvotes: 2