Reputation: 1683
I need to download only rpm files from different directories. Here is my code:
#!/usr/bin/env bash
# Download only rpm files from certain directories.
# wget
# -4 = only ipv4
# -A = accept list
# -r = recursively
# -R = reject list
# -c = continue
# -e = execute command
# --exclude-directories = take list
# Create a directory
mkdir mrepo
# Enter into the directory
cd mrepo
# RPM URL
repo_url="http://download.virtualbox.org/virtualbox/"
# Repo rpm
repo_download=('5.2.20' '5.2.22' '6.0.0')
# Exclude directories
exclude_dir=('*_Beta')
# Download all rpm packages
for i in "${repo_download[@]}"; do
echo $i/
echo ${repo_url}/$i/
# wget -A rpm -rc -e robots=off --reject "index.html*" ${repo_url}/$i/
wget -A zip -rc -e robots=off --reject "index.html*" ${repo_url}/$i/
done
# Tar the downloaded rpm
tar -cvzf missingrepo.tgz --exclude=./*.sh .
My goal is to download the rpm files only from the listed directories, using a for loop. It appears to work, but it does not: wget ascends to all directories and starts downloading rpm files from all of them. :-( I tried --exclude-directories= to exclude the unnecessary subdirectories, but the --exclude-directories argument is not working.
PS: To run quickly while testing, I download zip files instead of rpm files. This is the command I tried with --exclude-directories:
wget -A zip -rc -e robots=off --reject "index.html*" --exclude-directories=exclude_dir ${repo_url}/$i/
Any help would be appreciated!
Upvotes: 0
Views: 852
Reputation: 4148
Use the -np|--no-parent and -l|--level command-line options of wget.
-np, --no-parent: Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.
-l, --level: Specify recursion maximum depth level. If you want to download all the files from one directory, use ‘-l 1’ to make sure the recursion depth never exceeds one.
So the command should look like this:
wget -A zip -np -r -l 1 -c -e robots=off --reject "index.html*" ${repo_url}/${i}/
In my opinion, --reject "index.html*" is unnecessary. You should also change repo_url in your script to "http://download.virtualbox.org/virtualbox", without the trailing slash. That gives:
wget -A zip -np -r -l 1 -c -e robots=off ${repo_url}/${i}/
The result is then
mrepo/download.virtualbox.org/virtualbox/5.2.20/VirtualBoxSDK-5.2.20-125813.zip
mrepo/download.virtualbox.org/virtualbox/5.2.22/VirtualBoxSDK-5.2.22-126460.zip
mrepo/download.virtualbox.org/virtualbox/6.0.0/VirtualBoxSDK-6.0.0-127566.zip
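If you would rather not depend on how repo_url happens to be written, one option is to strip a trailing slash before building each URL. The following is a minimal sketch, not part of the original script: build_url is a hypothetical helper name, and it only prints the URLs so you can check them before handing them to wget.

```shell
#!/usr/bin/env bash
# Sketch: build one URL per version directory without a double slash.

# build_url is a hypothetical helper, not part of wget.
build_url() {
    local base="${1%/}"   # strip one trailing slash, if present
    local version="$2"
    printf '%s/%s/\n' "$base" "$version"
}

repo_url="http://download.virtualbox.org/virtualbox/"
repo_download=('5.2.20' '5.2.22' '6.0.0')

# Print the URLs only; no download happens here.
for i in "${repo_download[@]}"; do
    build_url "$repo_url" "$i"
done
```

Each printed URL can then be passed to the wget command above, for example as "$(build_url "$repo_url" "$i")".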
Just to be complete, a short version of the script is as follows:
#!/usr/bin/env bash
repo_url="https://download.virtualbox.org/virtualbox"
repo_download=('5.2.20' '5.2.22' '6.0.0')
for i in "${repo_download[@]}"; do
    wget -A zip -np -r -l 1 -c -e robots=off "${repo_url}/${i}/"
done
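As for --exclude-directories: the command in the question passes the literal word exclude_dir rather than the contents of the array, and wget expects a comma-separated list there. With -np and -l 1 the exclusion is probably not needed at all, but if you still want it, a sketch along these lines may help. join_by_comma is a hypothetical helper, and both patterns are assumptions: the first is modeled on the '*_Beta' entry from the question, the second is purely illustrative. Whether a given pattern matches depends on how wget compares it with the directory path, so treat the paths as something to verify. The script prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: turn a bash array of patterns into the comma-separated list
# that wget's --exclude-directories (-X) option expects.

# join_by_comma is a hypothetical helper, not part of wget.
join_by_comma() {
    local IFS=','      # "$*" joins the arguments with the first character of IFS
    printf '%s\n' "$*"
}

repo_url="https://download.virtualbox.org/virtualbox"
# Assumed patterns: the first follows the question's '*_Beta',
# the second is only here to show the joining.
exclude_dir=('/virtualbox/*_Beta' '/virtualbox/*_RC')
exclude_list="$(join_by_comma "${exclude_dir[@]}")"

# Dry run: print each command instead of executing it.
for i in 5.2.20 5.2.22 6.0.0; do
    printf "wget -A zip -np -r -l 1 -c -e robots=off --exclude-directories='%s' %s/%s/\n" \
        "$exclude_list" "$repo_url" "$i"
done
```

Once the printed commands look right, the printf can be replaced by the actual wget call, keeping the quotes around "$exclude_list" so the shell does not expand the wildcards itself.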
Upvotes: 1