Jake

Reputation: 65

Get list of files from https and save as txt file

I want to get a list of directories from an HTTPS server to use with aria2c.

As far as I know, unlike wget, aria2c has no recursive option, so I am going to feed it a txt file of URLs, as mentioned here

So I need the list of directories.

This is the target https.

I tried lftp, but it reported certificate errors.

I would be grateful if someone could tell me how to produce the txt file.
Thank you!

Upvotes: 0

Views: 315

Answers (1)

SIMULATAN

Reputation: 959

Try this hacked-together script. It needs GNU grep for the -P option.

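function list_folder() {
    # $1 is the path relative to the base URL; empty for the top-level listing
    local base='https://physionet.org/files/mimic3wdb-matched/1.0/'
    # keep the variables local so recursive calls don't overwrite each other
    local content folders files folder
    echo "Starting new run! $1"
    content=$(curl -s -L "$base$1")
    # folders are the entries that end with a `/`, minus the parent link `../`
    folders=$(echo "$content" | grep -o -P '(?<=">)[^<]*(?=/</a>)' | grep -v '\.\.')
    # files are all the entries that don't end with a `/`
    files=$(echo "$content" | grep -o -P '(?<=">)[^<]*[^/<](?=</a>)')
    echo "FOLDERS: $folders"
    echo "FILES: $files"
    for folder in $folders; do
        # append the folder to the current path, keeping the trailing slash
        list_folder "$1$folder/"
    done
}

list_folder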

It recursively walks the directory listing and prints every folder and file it finds. If you want to save the file names to a txt file, redirect $files into it, for example by appending >> files.txt to the echo that prints $files.
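As a rough sketch of that redirect, the helper below turns one directory listing into full URLs, one per line, which is the kind of list aria2c can read from a file. The function name listing_to_urls and the file name urls.txt are hypothetical, and it assumes (like the script above) that each link's text is the file name and that GNU grep is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read directory-listing HTML on stdin and print one
# full URL per file entry. Assumes GNU grep (for -P) and that each link's
# text equals the file name, as in the script above.
listing_to_urls() {
    local base="$1"   # URL of the directory the listing came from, ending in /
    local name
    # keep only link texts that do not end with a slash, i.e. files
    grep -o -P '(?<=">)[^<]*[^/<](?=</a>)' | while IFS= read -r name; do
        printf '%s%s\n' "$base" "$name"
    done
}
```

For example, curl -s -L "$url" | listing_to_urls "$url" >> urls.txt appends the files of one directory to urls.txt, and the finished list can then be passed to aria2c with aria2c -i urls.txt.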

You can also try running it in parallel by appending a & to the list_folder calls, which starts each call as a background job (separate processes rather than threads).
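To make the background-job idea concrete, here is a minimal sketch that starts one job per folder with & and then waits for all of them. The names crawl_parallel and process_folder are hypothetical; process_folder is only a stand-in for the real per-folder work, such as the list_folder call above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the per-folder work (e.g. list_folder above).
process_folder() {
    echo "done: $1"
}

# Start one background job per argument, then block until all have finished.
crawl_parallel() {
    local folder
    for folder in "$@"; do
        process_folder "$folder" &
    done
    wait   # without this, the caller continues before the jobs have finished
}
```

Output from background jobs can arrive in any order, so sort the result or write each job's output to its own file if the order matters. Starting one job per folder with no limit may also put a lot of load on the server.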

Upvotes: 1
