sophie

Reputation: 115

Grabbing all .nc files from a URL to read the data using MATLAB

I'd like to get all .nc files from a URL and read the data using MATLAB. However, the filenames are always very long and vary from file to file. For instance, I have

    url = 'http://sourcename/filename.nc'

The sourcename is always the same, but the filename is long and varies, so I would like to just use * to grab whatever .nc files are at the URL,

e.g.

    url = 'http://sourcename/*.nc'

but this does not work, and I am guessing I need the exact name, so I am not sure what to do here.

On the other hand, it would also be interesting to get the name of each file and record it, but I am not sure how to do that either.

Thanks a lot in advance!!

Upvotes: 2

Views: 757

Answers (2)

Ward F.

Reputation: 341

If you have a list of the file names in a text file, you can use the wget utility to process the file and fetch all the listed files. This file would be formatted as follows:

http://url.com/file1.nc
http://url.com/file2.nc
(etc)

You would then invoke wget as follows:

$ wget -i url-file.txt
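
If you would rather drive this from MATLAB, a sketch like the following could write the URL file and invoke wget via system(); the file names here are hypothetical placeholders for your actual list:

    names = {'file1.nc', 'file2.nc'};   % hypothetical file names
    fid = fopen('url-file.txt', 'w');
    for k = 1:numel(names)
        fprintf(fid, 'http://url.com/%s\n', names{k});
    end
    fclose(fid);
    system('wget -i url-file.txt');     % wget must be on the system path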

Alternatively, you may be able to use wget to fetch the files recursively, if they are all located in the same directory on the web server, e.g.:

$ wget -r -l1 http://url.com/directory

The -r flag tells wget to recurse, and the -l1 flag limits the recursion to one level.

This solution is external to Matlab, but once you have all of the files downloaded, you can work with them all locally.
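
For the local processing step, a short MATLAB loop can pick up the downloaded files with dir and read them with the built-in netCDF functions; 'myvar' below is a placeholder for whichever variable your files actually contain:

    files = dir('*.nc');              % all downloaded .nc files in the current folder
    for k = 1:numel(files)
        info = ncinfo(files(k).name); % inspect the variables each file contains
        % Replace 'myvar' with a real variable name from info.Variables:
        % data = ncread(files(k).name, 'myvar');
    end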

wget is a fairly standard utility on Linux systems, and it is also available for OS X and Windows. The wget homepage is here: https://www.gnu.org/software/wget/

Upvotes: 1

Peter

Reputation: 14927

HTTP does not implement a filesystem abstraction, which means that each of the URLs you request could be handled in a completely different way. In many cases there is also no way to get a list of allowable URLs from a parent (a directory listing, in other words).

It may be the case for you that http://sourcename/ actually returns an index document containing a list of the files. If so, first fetch that document and parse its contents to extract the list of files. You can then loop over those files, form a new URL for each one, and fetch them in sequence.
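
In MATLAB, that workflow might look like the sketch below, using webread, regexp, and websave. It assumes the index is plain HTML with relative links ending in .nc; the regular expression is a guess and may need adjusting for the real page:

    base = 'http://sourcename/';
    html = webread(base);                      % fetch the index document
    % Pull out href targets ending in .nc; the pattern assumes relative
    % links and may need adjusting for the actual page structure.
    names = regexp(html, 'href="([^"]+\.nc)"', 'tokens');
    for k = 1:numel(names)
        fname = names{k}{1};
        websave(fname, [base fname]);          % download and record the name
    end

As a side benefit, the names cell array gives you a record of each file name, which covers the second part of the question.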

Upvotes: 1
