Cynthia GS

Reputation: 522

Using NCKS in NCO with large dataset with changing file names

I am trying to extract one variable from large NetCDF files hosted on an FTP server. One option that works well is to download the files one by one with wget, extract the variable I want into a new .nc file using ncks, and delete the original file. However, the original .nc files are big, and it's going to take a while to download them all.

I wanted to use NCO's capability to use a file on an FTP server as the input file, but I'm having trouble finding the appropriate translation of the * in wget cyg05*.nc.

Here is one of the folders I am interested in: ftp://podaac.jpl.nasa.gov/allData/cygnss/preview/L1/v1.1/2017/077/

I only want to track the first 5 characters of the actual file name, and don't care about the rest. I have tried:

for i in `seq 77 1 257`;
    do
        if [ $i -ge 10 ] && [ $i -lt 100 ]; then
            for j in `seq 1 1 8`;
                do
                  ncks -l . ftp://podaac.jpl.nasa.gov/allData/cygnss/preview/L1/v1.1/2017/0${i}/cyg0${j}'......'.nc 2017_Day_0${i}_Spacecraft_0${j}.nc
                done
        fi
    done

I have also tried replacing '......' with * and with ??????, but without any luck: the file is not recognized. I am aware of the NCO help, in particular this section: http://nco.sourceforge.net/nco.html#Large-Numbers-of-Files, but I'm not sure I understand the solution described there.

I am on MacOS 10.11.6.

Upvotes: 0

Views: 727

Answers (1)

Charlie Zender

Reputation: 6322

The wildcard characters ("*" and "?") work with NCO on local files and through the SSH protocol, but not through the FTP protocol. wget is smarter than NCO here and can glob files through FTP, whereas NCO needs to know the full filename to work through FTP. So it is probably simplest to use the wget method. Even if NCO globbing worked through FTP, NCO would still download each file in its entirety, and thus be no faster than wget.
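For illustration, the wget method could be scripted roughly as follows. This is a minimal sketch, not a tested recipe. The variable name in VAR is a hypothetical placeholder (the question does not say which variable is wanted), the helper function names are made up for this example, and it assumes one file per spacecraft per day.

```shell
#!/bin/sh
# Sketch only: VAR is a hypothetical placeholder; replace it with the real variable name.
VAR="my_variable"
BASE="ftp://podaac.jpl.nasa.gov/allData/cygnss/preview/L1/v1.1/2017"

# Zero-pad a day of year to three digits (77 -> 077).
pad_day() { printf '%03d' "$1"; }

# Build the output file name used in the question.
out_name() { printf '2017_Day_%s_Spacecraft_0%s.nc' "$(pad_day "$1")" "$2"; }

# Download one spacecraft's file for one day, keep only VAR, delete the original.
fetch_and_extract() {
    day=$(pad_day "$1")
    sc=$2
    # Quote the URL so the shell leaves "*" alone and wget does the FTP globbing.
    wget -q "${BASE}/${day}/cyg0${sc}*.nc" || return 1
    for f in cyg0"${sc}"*.nc; do
        [ -e "$f" ] || continue
        # Assumes one file per spacecraft per day; a second match would overwrite the output.
        ncks -O -v "$VAR" "$f" "$(out_name "$1" "$sc")"
        rm -f "$f"
    done
}

# Loop over days 77..257 and spacecraft 1..8, as in the question.
run_all() {
    for i in $(seq 77 257); do
        for j in $(seq 1 8); do
            fetch_and_extract "$i" "$j" || echo "no file for day $i, spacecraft $j" >&2
        done
    done
}
```

Sourcing the script and calling run_all starts the loop. Each original file is deleted as soon as its variable has been extracted, so disk use stays at roughly one large file at a time, although every file is still downloaded in full.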

Upvotes: 1
