Strider

Reputation: 61

Python: Access ftp like browsers do, with proxy

I want to access an ftp server anonymously, just for downloads. My company has a proxy, and the ftp port (21) is blocked, so I can't access the ftp server directly.

What I want to do is write some code that behaves exactly the way browsers do. The idea is that, if I can download the files with my browser, there must be a way to do it with code.

My code works when I try to access a web site outside the company, but it still does not work for ftp servers.

proxy = urllib2.ProxyHandler({'https': 'proxy.mycompanhy.com:8080',
                              'http': 'proxy.mycompanhy.com:80',
                              'ftp': 'proxy.mycompanhy.com:21' })
auth = urllib2.HTTPBasicAuthHandler()
opener = urllib2.build_opener(proxy, auth, urllib2.HTTPHandler)
urllib2.install_opener(opener)

urlAddress = 'https://python.org'
# urlAddress = 'ftp://ftp1.cptec.inpe.br'

conn = urllib2.urlopen(urlAddress)
return_str = conn.read()
print return_str

When I try to access python.org, it works fine. If I remove the install_opener part, it does not work anymore, which shows that the proxy is required. When I use the ftp url, the call hangs (or times out, if I pass a timeout).

I understand that ftp and http are two very different protocols. What I don't understand is the mechanism that browsers use to access these ftp servers. I don't know whether there is a layer on the server side that translates between http and ftp and returns an html page, or whether the browser accesses the ftp server in some other manner and builds the page itself.

There might also be some confusion between the ftp scheme in the url and the connection mode. It seems to me that when urllib2 reads ftp://... it automatically uses port 21.
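On the port question, a quick check with the standard library supports that guess: an ftp:// url without an explicit port carries no port of its own, and ftplib's default, FTP_PORT, is 21. A minimal sketch using the commented-out url from the code above; the import fallback is only there so that it runs on both Python 2 and Python 3.

```python
import ftplib

try:
    from urlparse import urlparse        # Python 2
except ImportError:
    from urllib.parse import urlparse    # Python 3

parts = urlparse('ftp://ftp1.cptec.inpe.br')

# The url names no port, so .port is None
explicit_port = parts.port

# Fall back to ftplib's default ftp port
effective_port = explicit_port or ftplib.FTP_PORT

print('%s %s %d' % (parts.scheme, parts.hostname, effective_port))
```

So unless the url itself names another port, a direct ftp connection would be expected to go to port 21, which is the port the company blocks.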

Upvotes: 4

Views: 1803

Answers (1)

Strider

Reputation: 61

I found a solution using wget. This package handles proxies, but the documentation was very obscure. You need to set an environment variable with the proxy name.

import wget
import os
import errno

# setup proxy
os.environ["ftp_proxy"] = "proxy.mycompanhy.com"
os.environ["http_proxy"] = "proxy.mycompanhy.com"
os.environ["https_proxy"] = "proxy.mycompanhy.com"

src = "http://domain.gov/data/fileToDownload.txt"
out = "C:\\outFolder\\outFileName.txt" # out is optional

# create output folder if it doesn't exist
outFolder, _ = os.path.split( out )
try:
    os.makedirs(outFolder)
except OSError as exc: # Python >2.5
    # ignore the error only when the folder is already there
    if exc.errno == errno.EEXIST and os.path.isdir(outFolder):
        pass
    else:
        raise

# download
filename = wget.download(src, out)
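
For comparison, the urllib2 attempt from the question might also work if the 'ftp' entry points at the proxy's HTTP port with an explicit http:// scheme. As far as I understand it, that is roughly what browsers do behind such a proxy: they send the ftp:// url to the HTTP proxy and let the proxy speak ftp on their behalf. The sketch below is untested against a real proxy and rests on two assumptions: that the proxy supports ftp over HTTP, and that it accepts those requests on port 8080 (the port used for https in the question). build_ftp_over_http_opener is a hypothetical helper name, not part of any library.

```python
try:
    import urllib2 as urlrequest          # Python 2
except ImportError:
    import urllib.request as urlrequest   # Python 3

# Assumption: the proxy handles ftp urls on its HTTP port 8080
FTP_VIA_HTTP_PROXY = 'http://proxy.mycompanhy.com:8080'


def build_ftp_over_http_opener(ftp_proxy_url=FTP_VIA_HTTP_PROXY):
    """Return an opener that hands ftp:// urls to an HTTP proxy."""
    proxy = urlrequest.ProxyHandler({
        'http': 'http://proxy.mycompanhy.com:80',
        'https': 'http://proxy.mycompanhy.com:8080',
        # The http:// scheme makes urllib talk HTTP to the proxy
        # instead of opening an ftp connection to it
        'ftp': ftp_proxy_url,
    })
    return urlrequest.build_opener(proxy)


opener = build_ftp_over_http_opener()

# Usage (needs network access through the proxy):
# data = opener.open('ftp://ftp1.cptec.inpe.br', timeout=30).read()
```

Without a scheme in the 'ftp' entry, urllib treats the proxy as an ftp host and tries to reach it on port 21, which may explain why the original code hangs.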

Upvotes: 2
