Reputation: 65
I want to get a list of directories from an HTTPS site for aria2c.
Since, as far as I know, aria2c has no recursive option (unlike wget), I am going to use a text file of URLs as mentioned here, so I need the list of directories.
This is the target HTTPS site.
I tried lftp, but there were some certificate errors.
I would be grateful if you could let me know how to get the text file.
Thank you!
Upvotes: 0
Views: 315
Reputation: 959
Try this hacked-together script.
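function list_folder() {
    # $1 is the path relative to the base URL; empty for the top level
    echo "Starting new run! $1"
    local content folders files folder
    content=$(curl -s -L 'https://physionet.org/files/mimic3wdb-matched/1.0/'"$1")
    # folders are the entries that end with a `/`, except the parent link `..`
    folders=$(echo "$content" | grep -o -P '(?<=">).*(?=/</a>)' | grep -v '\.\.')
    # files are all the entries that don't end with a `/`
    files=$(echo "$content" | grep -o -P '(?<=">).*[^/](?=<\/a>)')
    echo "FOLDERS: $folders"
    echo "FILES: $files"
    for folder in $folders; do
        # append the folder name plus a trailing slash so the URL stays well formed
        list_folder "$1$folder/"
    done
}
list_folder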
It recursively walks the directory listing and prints every folder and file it finds. If you want to save the file names to a file, just redirect $files into it.
You can also try running the lookups in parallel by appending a & to the recursive list_folder calls.
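For aria2c you will probably want full URLs, one per line, rather than bare file names. Here is a rough sketch of that idea. It is a variation on the script, not a drop-in replacement: it uses sed instead of grep -P, and the function names extract_file_urls, extract_folders and crawl are made up for this example. It assumes the listing puts one link per line and that folder names contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of any tool.

# Read one directory-listing page on stdin and print a full URL for every
# entry whose link text does not end in "/" (i.e. the files).
extract_file_urls() {
    local base="$1" name
    sed -n 's|.*">\([^<]*[^/<]\)</a>.*|\1|p' |
    while IFS= read -r name; do
        printf '%s%s\n' "$base" "$name"
    done
}

# Read one listing page on stdin and print the sub-folder names
# (link text ending in "/"), skipping the parent link "../".
extract_folders() {
    sed -n 's|.*">\([^<]*/\)</a>.*|\1|p' | grep -v '^\.\./$'
}

# Walk the listing recursively, printing file URLs as they are found.
# $1 must be a folder URL that ends with "/".
crawl() {
    local base="$1" page folder
    page=$(curl -s -L "$base") || return
    printf '%s\n' "$page" | extract_file_urls "$base"
    for folder in $(printf '%s\n' "$page" | extract_folders); do
        crawl "$base$folder"
    done
}

# Usage (hits the network, so it is left commented out):
#   crawl 'https://physionet.org/files/mimic3wdb-matched/1.0/' > urls.txt
#   aria2c -i urls.txt
```

aria2c reads the URLs from the file given with -i. As far as I know it puts everything into one download directory unless the input file says otherwise, so check the aria2c manual if you need the folder structure preserved.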
Upvotes: 1