Batman

Reputation: 8947

Download Files From FTP Server Using Regular Expression

I have an FTP server that hosts data files, where the date the data relates to is encoded in the file names. I want to write a process that can find and download all the files associated with a particular date. The complication is that different files use different encodings. (Unfortunately, changing or standardising the names isn't an option.) The year can be four digits or two. The month can be two digits or three letters. The day is only sometimes included, and the date substring can appear anywhere in the file name.

At the moment, I'm creating a list of all the files on the server, then using a regular expression to determine which files are relevant, and then downloading those files.

Is it possible to condense the first two steps? That is, is there a way to get the server to return the list of files that match the expression?

I'm using Python's ftplib, if that makes any difference.

Upvotes: 1

Views: 2267

Answers (2)

biorpg

Reputation: 11

It should be fairly simple to use LIST, MLSD or NLST to build a local index of the files on the FTP server, use a regex to filter the unwanted files out of that index, and then feed the remainder to a batch script that downloads them.
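As a rough sketch of the regex side, here is one way to build a pattern for a given date. The helper name `date_pattern`, the allowed separators (`-`, `_`, `.`) and the year-month-day ordering are all assumptions on my part, since the question doesn't spell out the exact naming schemes; adjust them to whatever the real file names look like.

```python
import re
from datetime import date

# Fixed English abbreviations, so the result does not depend on the locale.
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def date_pattern(d):
    """Hypothetical helper: compile a regex for one date.

    Assumed layout: year (4 or 2 digits), month (2 digits or 3 letters),
    optional 2-digit day, each optionally separated by '-', '_' or '.'.
    """
    year = "(?:%04d|%02d)" % (d.year, d.year % 100)
    month = "(?:%02d|%s)" % (d.month, MONTH_ABBR[d.month - 1])
    day = "(?:%02d)?" % d.day
    sep = "[-_.]?"
    # The lookarounds stop the match from starting or ending inside a
    # longer run of digits (so a different day does not slip through).
    return re.compile(
        r"(?<!\d)" + year + sep + month + sep + day + r"(?!\d)",
        re.IGNORECASE,
    )


pattern = date_pattern(date(2014, 3, 7))
names = ["data_20140307.csv", "report-14Mar.txt", "data_20140308.csv"]
wanted = [n for n in names if pattern.search(n)]
```

Short two-digit encodings are inherently ambiguous (a name such as `v14_03.txt` would also match), so the pattern may need tightening for a real directory.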

Upvotes: 1

loopbackbee

Reputation: 23342

The short answer is no, this isn't possible (using FTP).

RFC 5797 Section 3 defines the available commands for FTP clients. The commands that list files on the remote server are LIST, MLSD and NLST, and for all of them the only available argument is the name of the directory. There's no way to filter files, by regex or otherwise.

It's not a big overhead to get the listings and parse them in the client, though, unless you're dealing with millions of files.
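For what it's worth, the client-side version is only a few lines with ftplib. This is a minimal sketch, not a drop-in solution: the function names, the anonymous login defaults and the choice to save files into the current working directory are my assumptions, and the host and directory in the usage comment are placeholders.

```python
import os
import re
from ftplib import FTP


def matching_names(names, pattern):
    """Keep only the names in which the compiled regex matches somewhere."""
    return [name for name in names if pattern.search(name)]


def download_matching(host, directory, pattern, user="anonymous", passwd=""):
    """List one remote directory, filter it locally, download the matches."""
    with FTP(host) as ftp:
        ftp.login(user, passwd)
        ftp.cwd(directory)
        # NLST returns bare names; all filtering happens on the client.
        wanted = matching_names(ftp.nlst(), pattern)
        for name in wanted:
            # Some servers return paths from NLST, so strip any directory
            # part before using the name as the local file name.
            with open(os.path.basename(name), "wb") as fh:
                ftp.retrbinary("RETR " + name, fh.write)
    return wanted


# Hypothetical usage (placeholder host and directory):
# download_matching("ftp.example.com", "/data", re.compile(r"2014-?03"))
```

Keeping the filter in its own function means the regex can be checked against a saved listing without touching the server.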

Upvotes: 0
