Reputation: 11700
I am trying to download the contents of a website using the wget tool. I used the -R option to reject some file types, but there are other files that I don't want to download. These files have no extension and are named as follows:
string-ID
for example:
newsbrief-02
How can I tell wget not to download these files (files whose names start with a specified string)?
Upvotes: 29
Views: 32336
Reputation: 7694
Since (apparently) v1.14, wget accepts regular expressions: --reject-regex and --accept-regex (with --regex-type posix by default; it can be set to pcre if wget was compiled with libpcre support).
Beware that it seems you can use --reject-regex only once per wget call. That is, you have to use | in a single regex if you want to match several patterns:
wget --reject-regex 'expr1|expr2|…' http://example.com
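Applied to the file names in the question, this could look like the sketch below. Note that wget matches --reject-regex against the complete URL, not only the file name, so the pattern anchors on the last path component. The pattern and the sample URLs are made up for illustration, and the preview assumes that grep -E treats this simple pattern the same way wget's posix regex type does.

```shell
# Hypothetical pattern: reject any URL whose last path component
# starts with "newsbrief-".
pattern='/newsbrief-[^/]*$'

# Preview which of a few made-up URLs the pattern would match.
rejected=$(printf '%s\n' \
  'http://example.com/news/newsbrief-02' \
  'http://example.com/news/index.html' \
  'http://example.com/newsbrief-archive/page.html' \
  | grep -E "$pattern")
echo "$rejected"

# The actual download would then be something like (not run here):
# wget -r --reject-regex "$pattern" http://example.com
```

Only the first URL is printed; the third one is kept because "newsbrief-" appears there in a directory name rather than in the final component.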
Upvotes: 51
Reputation: 64563
You cannot specify a regular expression with the wget -R option, but you can specify a glob pattern (like a filename pattern in a shell).
The answer looks like:
$ wget -R 'newsbrief-*' ...
You can also use ? and character classes [].
For more information see info wget.
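One way to try out such a pattern before handing it to -R is a shell case statement, which uses the same kind of glob matching. This is only a rough sketch: the pattern and file names are hypothetical, and it assumes wget's matching agrees with the shell's for patterns this simple.

```shell
# Hypothetical glob: "newsbrief-", then one digit, then any single character.
pattern='newsbrief-[0-9]?'

matched=''
for name in newsbrief-02 newsbrief-1a newsbrief-x1 index.html; do
  # $pattern is left unquoted so that case treats it as a glob.
  case "$name" in
    $pattern) matched="$matched $name" ;;
  esac
done
echo "matched:$matched"

# The corresponding download would be something like (not run here):
# wget -r -R "$pattern" http://example.com
```

Here newsbrief-02 and newsbrief-1a match, while newsbrief-x1 and index.html do not.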
Upvotes: 11