chenoi

Reputation: 565

Download the latest file according to timestamp in file name from SFTP server

I'm trying to get the newest file from a directory on a remote Linux server. A file is created on the SFTP server every 4 hours, and the file names follow the pattern filegen_date_hour.json, as in the example below. In this case, the latest file, 'filegen_20200101_0800.json', needs to be transferred to my local directory.

filegen_20200101_0000.json
filegen_20200101_0400.json
filegen_20200101_0800.json

I use the Python 3 code below, but I get this error:

latestFile = max(listFile, key=os.path.getctime)
ValueError: max() arg is an empty sequence

My SFTP code:

myHostname = "192.168.100.10"
myUsername = "user"
myPassword = "password"

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None

with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword, cnopts=cnopts) as sftp:
    with sftp.cd('/home/operation/genfiles/'):
        # Files follow the pattern filegen_*.json
        fileDir = '/home/operation/genfiles/filegen_*.json'
        listFile = glob.glob(fileDir)
        latestFile = max(listFile, key=os.path.getctime)
        sftp.get(latestFile)

I would appreciate any help with this. Thank you.

Upvotes: 2

Views: 3850

Answers (1)

Martin Prikryl

Reputation: 202272

First, you cannot use glob to list files on an SFTP server. glob will not start querying the SFTP server just because you opened an SFTP connection earlier; it still queries the local file system. With no matching files on the local machine, listFile is empty, which is why max() raises the ValueError.

Use pysftp Connection.listdir. It does not support wildcards, though, so you will have to filter the returned file names yourself, as shown here:
List files on SFTP server matching wildcard in Python using Paramiko


Only then can you try to find the latest file. In general, you can use the file modification time, as here:
How to download only the latest file from SFTP server with Paramiko?
The code there is for Paramiko SFTPClient.listdir_attr, but it works the same way with pysftp Connection.listdir_attr.
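As a rough sketch of the modification-time approach: the helper below picks the newest entry from a list of attribute objects such as those returned by listdir_attr, which carry filename and st_mtime. The function name latest_by_mtime and the demo entries (including their timestamps) are made up for illustration; the stand-in objects only mimic those two attributes so the snippet runs without a server.

```python
import fnmatch
from types import SimpleNamespace


def latest_by_mtime(entries, pattern="filegen_*.json"):
    """Return the name of the most recently modified matching entry, or None.

    `entries` is any iterable of objects with `filename` and `st_mtime`
    attributes (hypothetical helper, not part of pysftp or Paramiko).
    """
    candidates = [e for e in entries if fnmatch.fnmatch(e.filename, pattern)]
    if not candidates:
        # Avoids "max() arg is an empty sequence" when nothing matches
        return None
    return max(candidates, key=lambda e: e.st_mtime).filename


# Stand-in entries with invented modification times (seconds since the epoch)
demo = [
    SimpleNamespace(filename="filegen_20200101_0000.json", st_mtime=1577836800),
    SimpleNamespace(filename="filegen_20200101_0800.json", st_mtime=1577865600),
    SimpleNamespace(filename="filegen_20200101_0400.json", st_mtime=1577851200),
    SimpleNamespace(filename="readme.txt", st_mtime=1577900000),
]

print(latest_by_mtime(demo))
```

With a live connection, the same helper could presumably be called as latest_by_mtime(sftp.listdir_attr()) inside the sftp.cd block.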

But in your case, I'm not sure you can rely on the modification timestamp. It seems that you actually want to use the timestamp in the file name. Because your file name format uses fixed-width, zero-padded date and time fields, the lexicographically last name is also the latest file, so you can simply pick that one.

import fnmatch

...

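with sftp.cd('/home/operation/genfiles'):
    # List the remote directory and keep only names matching the pattern
    files = []
    for filename in sftp.listdir():
        if fnmatch.fnmatch(filename, "filegen_*.json"):
            files.append(filename)
    # The zero-padded timestamp in the name sorts lexicographically
    latestFile = max(files)
    # Download the chosen file to the current local directory
    sftp.get(latestFile)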
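If you would rather not depend on string ordering, a possible variant is to parse the timestamp out of each name and compare real datetime values. This is only a sketch: the helper name latest_by_name_timestamp is hypothetical, and the format string assumes the names look exactly like filegen_YYYYMMDD_HHMM.json as in the question. It also returns None instead of raising when no file qualifies.

```python
import fnmatch
from datetime import datetime


def latest_by_name_timestamp(filenames, pattern="filegen_*.json"):
    """Return the name carrying the newest embedded timestamp, or None.

    Hypothetical helper; assumes names of the form filegen_YYYYMMDD_HHMM.json.
    """
    dated = []
    for name in filenames:
        if not fnmatch.fnmatch(name, pattern):
            continue
        try:
            stamp = datetime.strptime(name, "filegen_%Y%m%d_%H%M.json")
        except ValueError:
            # Matches the wildcard but not the expected timestamp layout
            continue
        dated.append((stamp, name))
    # Tuples compare by timestamp first, then by name
    return max(dated)[1] if dated else None


# Sample names from the question plus two that should be ignored
names = [
    "filegen_20200101_0000.json",
    "filegen_20200101_0800.json",
    "filegen_20200101_0400.json",
    "filegen_latest.json",
    "notes.txt",
]

print(latest_by_name_timestamp(names))
```

With a live connection, the list of names would come from sftp.listdir(), and the result could then be passed to sftp.get().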

Obligatory warning: Do not set cnopts.hostkeys = None unless you do not care about security. For the correct solution, see Verify host key with pysftp.

Upvotes: 1
