Reputation: 43
From GCP, I have access to an SFTP server containing several folders, each holding a number of .csv files. A few times a day, a new .csv file is uploaded to some of these folders. I have reserved a static IP address on GCP Compute Engine and had it whitelisted, so I can use it to SSH into the SFTP server.
What I would like to do is schedule an hourly job that synchronizes the folders on the SFTP server with a Cloud Storage bucket named 'sftp-data' that has the same folder structure.
My idea was to use Cloud Scheduler to schedule the job and to use the Compute Engine instance with the reserved static IP address to synchronize the two directory trees. Is this possible to implement on GCP? Would I need to involve a Cloud Function for some reason? I would appreciate practical guidance on how this job can be automated.
Many thanks in advance!
Upvotes: 0
Views: 1500
Reputation: 2055
Currently, Google Cloud Platform doesn't have a dedicated product for moving files to or from Google Cloud Storage (GCS) over SFTP.
There are several products that you can use to transfer objects from SFTP to GCS. One example is sftp-gcs, which is written in Node.js and has been tested in several runtimes, including as a container. However, its current implementation supports only a single target bucket.
Another solution is to use SFTP Gateway to transfer files to Google Cloud Storage. It is a paid service, but it costs only six cents an hour plus infrastructure charges. It includes a web interface and a REST API for user management, folder permissions, and instance administration, whether you're supporting a single user or thousands.
It can also automate the file transfer process, which saves the team time.
Would I need to involve a Cloud Function for some reason?
There is no need to involve Cloud Functions or Cloud Run. Because they are serverless and only accept HTTPS requests, they cannot be used to run this solution: SFTP runs as a subsystem of the SSH protocol, not over HTTPS.
You can also schedule a Compute Engine instance to start and stop; see this documentation. The setup involves creating and deploying functions and setting up scheduled jobs that call Pub/Sub.
You can also check this documentation on SFTP access to Google Cloud Storage for more information.
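If you go with the Compute Engine approach from the question, the hourly job itself could look roughly like the Python sketch below. It is only an illustration under several assumptions: the function names (`plan_sync`, `list_sftp_files`, `sync_once`) are made up for this example, the VM authenticates to the SFTP server with an SSH key and already has the server in its known_hosts, the third-party packages `paramiko` and `google-cloud-storage` are installed, and a file counts as "changed" when its size differs from the object in the bucket.

```python
import posixpath
import stat


def plan_sync(remote_files, bucket_objects):
    """Return the relative paths that must be uploaded, in sorted order.

    Both arguments map a relative path (e.g. "folder/file.csv") to a size in
    bytes. A file is uploaded if the bucket lacks it or the sizes differ.
    """
    return sorted(
        path
        for path, size in remote_files.items()
        if bucket_objects.get(path) != size
    )


def list_sftp_files(sftp, root):
    """Recursively map relative path -> size for every file under root."""
    files = {}
    pending = [root]
    while pending:
        directory = pending.pop()
        for entry in sftp.listdir_attr(directory):
            full_path = posixpath.join(directory, entry.filename)
            # Assumes the server reports st_mode for every entry.
            if stat.S_ISDIR(entry.st_mode):
                pending.append(full_path)
            else:
                files[posixpath.relpath(full_path, root)] = entry.st_size
    return files


def sync_once(host, username, key_path, remote_root, bucket_name="sftp-data"):
    """Copy new or changed files from the SFTP server into the bucket."""
    # Third-party packages, imported here so plan_sync stays dependency-free.
    import paramiko
    from google.cloud import storage

    ssh = paramiko.SSHClient()
    ssh.load_system_host_keys()  # assumption: host key is already trusted
    ssh.connect(host, username=username, key_filename=key_path)
    try:
        sftp = ssh.open_sftp()
        remote_files = list_sftp_files(sftp, remote_root)

        gcs = storage.Client()  # uses the VM's service account credentials
        bucket = gcs.bucket(bucket_name)
        bucket_objects = {b.name: b.size for b in gcs.list_blobs(bucket_name)}

        for rel_path in plan_sync(remote_files, bucket_objects):
            with sftp.open(posixpath.join(remote_root, rel_path), "rb") as src:
                bucket.blob(rel_path).upload_from_file(src)
    finally:
        ssh.close()
```

On an always-on VM, a cron entry such as `0 * * * *` could call `sync_once(...)` every hour without Cloud Scheduler at all. Comparing sizes keeps the sketch short but misses edits that leave the size unchanged; comparing modification times or checksums would be stricter.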
Upvotes: 0