Reputation: 41
I have files within directories that need to be transferred from one SFTP server to another. The common approach I have found is to download the files from server 1 and then upload them to server 2. My goal is to automate the transfer using serverless GCP Cloud Functions with Python. Any suggestions would be helpful.
I have tried the popular Paramiko library, but it fails to authenticate. Python's asyncssh library does connect to the servers, but I cannot figure out a way to transfer the files while preserving the same directory structure.
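For context, something like the following is what I have in mind for the server-to-server copy. This is only a rough sketch using asyncssh on both ends; the hostnames, usernames, key paths, and directories are all placeholders, and each file is streamed through memory, so it only suits modestly sized files:

import asyncio
import asyncssh
import stat

async def transfer_tree(src_sftp, dst_sftp, src_path, dst_path):
    # Mirror src_path on the source server into dst_path on the destination
    if not await dst_sftp.isdir(dst_path):
        await dst_sftp.makedirs(dst_path)
    for name in await src_sftp.listdir(src_path):
        if name in (".", ".."):  # servers may include these entries
            continue
        src_file = f"{src_path}/{name}"
        dst_file = f"{dst_path}/{name}"
        attrs = await src_sftp.stat(src_file)
        if stat.S_ISDIR(attrs.permissions):
            await transfer_tree(src_sftp, dst_sftp, src_file, dst_file)
        elif stat.S_ISREG(attrs.permissions):
            # Read the whole file from server 1, then write it to server 2
            async with src_sftp.open(src_file, "rb") as src_f:
                data = await src_f.read()
            async with dst_sftp.open(dst_file, "wb") as dst_f:
                await dst_f.write(data)

async def main():
    async with asyncssh.connect("server1.example.com", username="user1",
                                client_keys=["key1"], known_hosts=None) as c1, \
               asyncssh.connect("server2.example.com", username="user2",
                                client_keys=["key2"], known_hosts=None) as c2:
        async with c1.start_sftp_client() as src, c2.start_sftp_client() as dst:
            await transfer_tree(src, dst, "/outgoing", "/incoming")

asyncio.run(main())

I have not been able to get something like this working end to end, which is why I am asking here.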
Edit: Here is the current implementation, which establishes a connection with the SFTP server and downloads the files to GCP Cloud Storage.
import asyncio
import asyncssh
import stat
from flask import jsonify, make_response
import functions_framework
from google.cloud import storage


async def download_files(server_url, username, private_key_path, remote_path, bucket_name):
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    downloaded_files = []
    async with asyncssh.connect(
        server_url, username=username, client_keys=[private_key_path], known_hosts=None
    ) as conn:
        async with conn.start_sftp_client() as sftp:
            await recursive_download(sftp, remote_path, bucket, downloaded_files)
    return downloaded_files


async def recursive_download(sftp, remote_path, bucket, downloaded_files):
    for filename in await sftp.listdir(remote_path):
        # Skip the '.' and '..' entries that SFTP servers may include in listings
        if filename in (".", ".."):
            continue
        remote_file = remote_path + "/" + filename
        attrs = await sftp.stat(remote_file)
        if stat.S_ISREG(attrs.permissions):
            # Read the remote file into memory, then upload it to GCS
            # under a blob name that mirrors the remote path
            async with sftp.open(remote_file) as remote_file_obj:
                file_data = await remote_file_obj.read()
            blob = bucket.blob(remote_file)
            blob.upload_from_string(file_data)
            print(f"Downloaded {remote_file}")
            downloaded_files.append(remote_file)
        elif stat.S_ISDIR(attrs.permissions):
            # Recurse into sub-directories
            await recursive_download(sftp, remote_file, bucket, downloaded_files)


@functions_framework.http
def main(request):
    try:
        server_url = "your_server_url"
        username = "your_username"
        private_key_path = "your_private_key_path"
        remote_path = "your_remote_path"
        bucket_name = "your_bucket_name"
        downloaded_files = asyncio.run(download_files(server_url, username, private_key_path, remote_path, bucket_name))
        return make_response(jsonify({"message": f"Files downloaded successfully. Total files: {len(downloaded_files)}. Files: {downloaded_files}"}), 200)
    except Exception as e:
        return make_response(jsonify({"error": str(e)}), 500)
The code works, but it feels overly complicated for the logic I want to implement next, such as downloading the files and sub-directories of a specified remote path.
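One simplification I have been considering is to drop the hand-rolled recursion and use asyncssh's built-in recursive get to pull the whole tree into the function's writable /tmp, then upload from there. A rough sketch, with the same placeholder credentials as above:

import asyncio
import os
import asyncssh
from google.cloud import storage

async def mirror_to_gcs(server_url, username, private_key_path, remote_path, bucket_name):
    local_root = "/tmp/sftp"  # Cloud Functions provide an in-memory, writable /tmp
    os.makedirs(local_root, exist_ok=True)
    async with asyncssh.connect(server_url, username=username,
                                client_keys=[private_key_path],
                                known_hosts=None) as conn:
        async with conn.start_sftp_client() as sftp:
            # recurse=True copies the whole tree, keeping the directory layout
            await sftp.get(remote_path, local_root, recurse=True)
    bucket = storage.Client().bucket(bucket_name)
    for dirpath, _, filenames in os.walk(local_root):
        for name in filenames:
            local_file = os.path.join(dirpath, name)
            # Name the blob after the path relative to the download root
            blob_name = os.path.relpath(local_file, local_root)
            bucket.blob(blob_name).upload_from_filename(local_file)

The trade-off is that everything passes through /tmp, which counts against the function's memory allocation, so the tree would have to fit in memory. Is that a reasonable direction, or is there a cleaner pattern?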
Upvotes: 1
Views: 389