frmzi

Reputation: 11

Download file from FTP server using a SOCKS 5 proxy in Python

I need to download a file from an FTP server using Python. I have tried ftplib, but it does not seem to work through a proxy. I have also tried the ftp-proxy-client package, but it does not seem to work either.

I have used these credentials.

ftp_adress = "XXX.XXX.XXX.XX"
ftp_user = "FTP_user"
ftp_password = "FTP_password"
ftp_port = "XXX"
ftp_path = "path"
file_name = "file_name.xlsx"
local_path = "local_path/file_name.xlsx"
proxy = "proxy_name.com"
proxy_port = "XXXX"

The code I have tried is:

from ftp_proxy_client import FtpProxy

ftp_proxy = FtpProxy(host=proxy, port=proxy_port)

ftp_client = ftp_proxy.connect(ftp_adress,
                               port=ftp_port,
                               login=ftp_user,
                               password=ftp_password)

fp = ftp_client.download(path=f"{ftp_path}/{file_name}")
with open(local_path, 'wb') as file:
    file.write(fp.read())

But I end up with this error:

FtpProxyError: Failed connecting to proxy server

Upvotes: 1

Views: 2133

Answers (2)

Matth

Reputation: 148

Martin's answer works, but it routes every subsequent connection in the process through the proxy, because it monkey-patches the underlying socket library.

That didn't work for us, so we created a subclass of ftplib.FTP that creates proxied sockets:

import ftplib
import socks

import sys

# HERE BE DRAGONS
# the built-in ftp library for python does not have proxy functionality
# it also does not have an easy way to extend it so that it does
# what follows is our attempt to add proxy abilities in the "least bad" way
#
# The code inside these methods is copied from the original source code with
# the minor changes marked inline. These changes create a proxied socket using
# PySocks and use that for the ftp connection


class ProxyFTP(ftplib.FTP):
    # call this method to set the proxy settings to use for this FTP object
    def set_proxy(
        self, proxy_type, proxy_addr, proxy_port, proxy_username, proxy_password
    ):
        self.proxy_type = proxy_type
        self.proxy_addr = proxy_addr
        self.proxy_port = proxy_port
        self.proxy_username = proxy_username
        self.proxy_password = proxy_password

    # creates the connection via our proxy settings
    def _proxy_create_connection(self, *args, **kwargs):
        return socks.create_connection(
            *args,
            proxy_type=self.proxy_type,
            proxy_addr=self.proxy_addr,
            proxy_port=self.proxy_port,
            proxy_username=self.proxy_username,
            proxy_password=self.proxy_password,
            **kwargs
        )

    # taken from
    # https://github.com/python/cpython/blob/890c3be8fb10bc329de06fa9d3b18dd8d90aa8b5/Lib/ftplib.py#L139
    def connect(self, host="", port=0, timeout=-999, source_address=None):
        """Connect to host.  Arguments are:
        - host: hostname to connect to (string, default previous host)
        - port: port to connect to (integer, default previous port)
        - timeout: the timeout to set against the ftp socket(s)
        - source_address: a 2-tuple (host, port) for the socket to bind
          to as its source address before connecting.
        """
        if host != "":
            self.host = host
        if port > 0:
            self.port = port
        if timeout != -999:
            self.timeout = timeout
        if self.timeout is not None and not self.timeout:
            raise ValueError("Non-blocking socket (timeout=0) is not supported")
        if source_address is not None:
            self.source_address = source_address
        sys.audit("ftplib.connect", self, self.host, self.port)
        ### Originally:
        # self.sock = socket.create_connection((self.host, self.port), self.timeout,
        #                                      source_address=self.source_address)
        ### Changed to:
        self.sock = self._proxy_create_connection(
            (self.host, self.port), self.timeout, source_address=self.source_address
        )
        self.af = self.sock.family
        self.file = self.sock.makefile("r", encoding=self.encoding)
        self.welcome = self.getresp()
        return self.welcome

    # taken from
    # https://github.com/python/cpython/blob/890c3be8fb10bc329de06fa9d3b18dd8d90aa8b5/Lib/ftplib.py#L336
    def ntransfercmd(self, cmd, rest=None):
        """Initiate a transfer over the data connection.
        If the transfer is active, send a port command and the
        transfer command, and accept the connection.  If the server is
        passive, send a pasv command, connect to it, and start the
        transfer command.  Either way, return the socket for the
        connection and the expected size of the transfer.  The
        expected size may be None if it could not be determined.
        Optional `rest' argument can be a string that is sent as the
        argument to a REST command.  This is essentially a server
        marker used to tell the server to skip over any data up to the
        given marker.
        """
        size = None
        if self.passiveserver:
            host, port = self.makepasv()
            ### Originally:
            # conn = socket.create_connection((host, port), self.timeout,
            #                                 source_address=self.source_address)
            ### Changed to:
            conn = self._proxy_create_connection(
                (host, port), self.timeout, source_address=self.source_address
            )
            try:
                if rest is not None:
                    self.sendcmd("REST %s" % rest)
                resp = self.sendcmd(cmd)
                # Some servers apparently send a 200 reply to
                # a LIST or STOR command, before the 150 reply
                # (and way before the 226 reply). This seems to
                # be in violation of the protocol (which only allows
                # 1xx or error messages for LIST), so we just discard
                # this response.
                if resp[0] == "2":
                    resp = self.getresp()
                if resp[0] != "1":
                    raise ftplib.error_reply(resp)
            except:
                conn.close()
                raise
        else:
            with self.makeport() as sock:
                if rest is not None:
                    self.sendcmd("REST %s" % rest)
                resp = self.sendcmd(cmd)
                # See above.
                if resp[0] == "2":
                    resp = self.getresp()
                if resp[0] != "1":
                    raise ftplib.error_reply(resp)
                conn, sockaddr = sock.accept()
                if self.timeout is not ftplib._GLOBAL_DEFAULT_TIMEOUT:
                    conn.settimeout(self.timeout)
        if resp[:3] == "150":
            # this is conditional in case we received a 125
            size = ftplib.parse150(resp)
        return conn, size

Hopefully this saves others time and headaches.
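A usage sketch for the class above may help, because the order of calls matters: the ftplib.FTP constructor connects immediately when it is given a host, so the ProxyFTP object should be created without arguments, then set_proxy called, then connect. The helper name fetch_via_proxy and its tuple parameters are hypothetical and not part of the code above. Ports are assumed to be integers, since connect compares port > 0.

```python
def fetch_via_proxy(ftp, proxy, server, credentials, remote_path, local_path):
    """Download remote_path to local_path through a proxy (illustrative helper).

    ftp         -- an unconnected ProxyFTP instance, created as ProxyFTP()
    proxy       -- (proxy_type, proxy_addr, proxy_port, username, password)
    server      -- (ftp_host, ftp_port), with the port as an int
    credentials -- (ftp_user, ftp_password)
    """
    # Proxy settings must be in place before connect() opens the socket.
    ftp.set_proxy(*proxy)
    ftp.connect(*server)
    ftp.login(*credentials)
    with open(local_path, "wb") as fh:
        # retrbinary goes through ntransfercmd, so the data connection
        # is opened via the proxy as well.
        ftp.retrbinary(f"RETR {remote_path}", fh.write)
    ftp.quit()


# Example call with placeholder values (assumes PySocks is installed):
# import socks
# fetch_via_proxy(
#     ProxyFTP(),
#     (socks.SOCKS5, "proxy_name.com", 1080, None, None),
#     ("ftp.example.com", 21),
#     ("FTP_user", "FTP_password"),
#     "path/file_name.xlsx",
#     "local_path/file_name.xlsx",
# )
```

Passing the FTP object in, rather than constructing it inside the helper, keeps the sketch independent of how ProxyFTP is imported.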

Upvotes: 3

Martin Prikryl

Reputation: 202474

Install PySocks, set a default proxy with set_default_proxy, and replace socket.socket with PySocks' socksocket class:

socks.set_default_proxy(socks.SOCKS5, ip, port)
socket.socket = socks.socksocket

From there, you use plain ftplib code, as if you were connecting to the FTP server directly.
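A fuller sketch of this approach follows, under a few assumptions. Setting the default proxy only configures PySocks' own socksocket class; ftplib is affected once socket.socket is replaced by socks.socksocket. Because that replacement is process-wide, the sketch wraps it in a context manager that restores the original class afterwards. The names patched_socket and download_via_socks5 are hypothetical, and all hosts, ports, and paths are placeholders supplied by the caller.

```python
import ftplib
import socket
from contextlib import contextmanager


@contextmanager
def patched_socket(socket_cls):
    """Temporarily replace socket.socket, restoring the original on exit."""
    original = socket.socket
    socket.socket = socket_cls
    try:
        yield
    finally:
        socket.socket = original


def download_via_socks5(proxy_host, proxy_port, ftp_host, ftp_port,
                        user, password, remote_path, local_path):
    # PySocks is imported lazily so that patched_socket itself
    # has no third-party dependency.
    import socks

    socks.set_default_proxy(socks.SOCKS5, proxy_host, proxy_port)
    # The patch has to stay active for the whole session: ftplib opens a
    # second (data) connection when the transfer starts.
    with patched_socket(socks.socksocket):
        ftp = ftplib.FTP()
        ftp.connect(ftp_host, ftp_port)  # ports are ints, e.g. 21
        ftp.login(user, password)
        with open(local_path, "wb") as fh:
            ftp.retrbinary(f"RETR {remote_path}", fh.write)
        ftp.quit()
```

While the with block is active, any other code in the same process that creates sockets also goes through the proxy, so this narrows the side effect without removing it.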

See How to make python Requests work via socks proxy.

Upvotes: 3
