Reputation: 131
I want to read some CSV/Excel files on a secure SFTP folder, make some changes (fixed changes in each file like remove column 2) in those files, upload them to a Postgre DB and also the upload them to a different SFTP path in Python
What's the best way to this?
I have made a connection to the SFTP using pysftp library and am reading the Excel:
import pysftp
import pandas as pd
myHostname = "*****"
myUsername = "****"
myPassword = "***8"
cnopts =pysftp.CnOpts()
cnopts.hostkeys = None
sftp=pysftp.Connection(host=myHostname, username=myUsername,
password=myPassword,cnopts=cnopts)
print ("Connection succesfully stablished ... ")
sftp.chdir('test/test')
#sftp.pwd
a=[]
for i in sftp.listdir_attr():
with sftp.open(i.filename) as f:
df=pd.read_csv(f)
How should I proceed with the upload to DB and making those changes to the CSV permanent?
Upvotes: 2
Views: 9660
Reputation: 202360
You have the download part done.
For the upload part, see How to Transfer Pandas DataFrame to .csv on SFTP using Paramiko Library in Python? – While it's for Paramiko, pysftp Connection.open
method behaves identically to Paramiko SFTPClient.open
, so the code is the same (though, you should not use pysftp).
Full code can be like:
with sftp.open("/remote/path/data.csv", "r+", bufsize=32768) as f:
# Download CSV contents from SFTP to memory
df = pd.read_csv(f)
# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'
# Upload the in-memory data back to SFTP
f.seek(0)
df.to_csv(f, index=False)
# Truncate the remote file in case the new version of the contents is smaller
f.truncate(f.tell())
The above updates the same file. If you want to upload to a different file, use this:
# Download CSV contents from SFTP to memory
with sftp.open("/remote/path/source.csv", "r") as f:
df = pd.read_csv(f)
# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'
# Upload the in-memory data back to SFTP
with sftp.open("/remote/path/target.csv", "w", bufsize=32768) as f:
df.to_csv(f, index=False)
For the purpose of bufsize
, see:
Writing to a file on SFTP server opened using Paramiko/pysftp "open" method is slow
Obligatory warning: Do not set cnopts.hostkeys = None
, unless you do not care about security. For the correct solution see Verify host key with pysftp.
Upvotes: 5
Reputation: 103
That's several questions in one question :)
I would suggest go with that approach:
pandas.io
or pure SQLAlchemy
. Check the docs hereUpvotes: 0