JohnL_10

Reputation: 569

Python: What is going on when I get or attempt to read an SFTP file?

This is a question about how SFTP operates. I am curious to understand what exactly is going on "behind the scenes" when I connect to an SFTP server and download a file.

I've been using some inherited code that connects to an SFTP server and lists a directory, e.g.:

sftp = pysftp.Connection(host='xxxxx', username='xxxx', password='xxxxx', cnopts=cnopts)
sftp.cwd('/some_dir')

server_files = sftp.listdir()

From there, let's say I want to read server_files[1] into a pandas DataFrame.

I cannot simply read the file by name, i.e.

pd.read_csv(server_files[1])
# FileNotFoundError: [Errno 2] No such file or directory: 

Instead, I must first get the file, as below.

sftp.get(server_files[1])
pd.read_csv(server_files[1])
# Success!

That works great. No issue there. My questions are, however:

Where in working memory is the server_files[1] object "stored"? I have executed type(server_files[1]) both before and after the get operation, and both results are str, which suggests to me that the object itself has not changed. So I don't really understand where the data for that file is being held. Why do I not need to do something like my_file = sftp.get(server_files[1])? How can it work on just the name of the file?

I appreciate this is less a question of how to make the code work and more one of how it operates, but I would nonetheless appreciate some help in understanding this.

Cheers

John

Upvotes: 1

Views: 1032

Answers (1)

Martin Prikryl

Reputation: 202494

Looks like a misunderstanding.

The sftp.get(server_files[1]) call downloads the remote file server_files[1] to a local file with the same name in the current working directory, because you do not specify the optional localpath (second) argument.

Quoting the pysftp Connection.get documentation (emphasis mine):

localpath (str) – the local path and filename to copy, destination. If not specified, file is copied to local current working directory


So pd.read_csv(server_files[1]) reads the local copy. It does not magically read the remote file.
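
To make this concrete without needing a server, here is a minimal offline sketch. The simulated_get helper is hypothetical (it is not part of pysftp) and only imitates the documented default of writing to the current working directory; two temporary directories stand in for the server directory and the local working directory.

```python
import os
import shutil
import tempfile


def simulated_get(remote_dir, remotepath, localpath=None):
    """Stand-in for an SFTP download: copy a file out of remote_dir.

    Hypothetical helper, not part of pysftp. It only imitates the documented
    default: with no localpath, the copy lands in the current working
    directory under the same file name.
    """
    if localpath is None:
        localpath = os.path.basename(remotepath)
    shutil.copyfile(os.path.join(remote_dir, remotepath), localpath)


def demo():
    original_cwd = os.getcwd()
    remote_dir = tempfile.mkdtemp()  # plays the role of the server directory
    work_dir = tempfile.mkdtemp()    # plays the role of the local working directory
    try:
        with open(os.path.join(remote_dir, "data.csv"), "w") as f:
            f.write("a,b\n1,2\n")
        os.chdir(work_dir)

        server_files = os.listdir(remote_dir)  # like sftp.listdir(): names only
        name = server_files[0]

        before = os.path.exists(name)  # no local file with that name yet
        simulated_get(remote_dir, name)
        after = os.path.exists(name)   # the copy now sits in the working directory

        with open(name) as f:          # a relative name resolves against the cwd
            content = f.read()
        return type(name).__name__, before, after, content
    finally:
        os.chdir(original_cwd)
        shutil.rmtree(remote_dir, ignore_errors=True)
        shutil.rmtree(work_dir, ignore_errors=True)


result = demo()
print(result)
```

The name is a str before and after the download; the only thing that changes is that a file with that name now exists on the local disk.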

server_files[1] is indeed just a string containing a file name. Nothing fancy.
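
If you would rather not depend on the current working directory, you can pass localpath explicitly and hand the resulting path to the reader. The following is only a sketch: download_to is a hypothetical helper, sftp is assumed to be an open pysftp.Connection, and StubConnection is an offline stand-in (not pysftp) so the example runs without a server.

```python
import os
import tempfile


def download_to(sftp, remote_name, target_dir):
    """Download remote_name into target_dir and return the local path.

    `sftp` is assumed to be an open pysftp.Connection; download_to itself is
    a hypothetical helper for illustration.
    """
    local_path = os.path.join(target_dir, os.path.basename(remote_name))
    sftp.get(remote_name, localpath=local_path)
    return local_path


class StubConnection:
    """Offline stand-in so the sketch runs without a server (not pysftp)."""

    def get(self, remotepath, localpath=None):
        if localpath is None:
            localpath = os.path.basename(remotepath)
        with open(localpath, "w") as f:
            f.write("a,b\n1,2\n")


target_dir = tempfile.mkdtemp()
local_path = download_to(StubConnection(), "data.csv", target_dir)
print(local_path, os.path.exists(local_path))
```

With a real connection, something like pd.read_csv(download_to(sftp, server_files[1], target_dir)) would then read the downloaded copy from a location you chose.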

Upvotes: 2
