Reputation: 9428
I have a site (http://a-site.com) with many links like the one below. How can I use wget to crawl the site and collect all similar links into a file?
<a href="/user/333333/follow_user" class="btn" rel="nofollow">Follow</a>
I tried this, but the command only extracts the matching links from a single page; it does not recursively follow other links to find more:
$ wget -erobots=off --no-verbose -r --quiet -O - http://a-site.com 2>&1 | \
grep -o '['"'"'"][^"'"'"']*/follow_user['"'"'"]'
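As a side note, the bracket expression `["']` is what forces the awkward nested shell quoting; if the site's attributes are consistently double-quoted, a simpler `grep -E` pattern does the same job. A minimal local check (`sample.html` and the second link are made-up stand-ins mirroring the question's markup):

```shell
# Build a small sample page like the one in the question
# (the hrefs here are placeholders, not real site content).
cat > sample.html <<'EOF'
<a href="/user/333333/follow_user" class="btn" rel="nofollow">Follow</a>
<a href="/user/444444/follow_user" class="btn" rel="nofollow">Follow</a>
<a href="/about" class="btn">About</a>
EOF

# Extract only the quoted hrefs that end in /follow_user
grep -oE '"[^"]*/follow_user"' sample.html
```

This prints the two `/follow_user` hrefs (with surrounding quotes) and skips the `/about` link.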
Upvotes: 0
Views: 133
Reputation: 7704
You may want to use the --accept-regex option of wget rather than piping through grep:
wget -r --accept-regex '['"'"'"][^"'"'"']*/follow_user['"'"'"]' http://a-site.com
(Not tested; the regex may need adjustment or an explicit --regex-type (see man wget), and of course add any other options you find useful.)
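One adjustment that is likely needed: per the wget manual, --accept-regex is matched against the complete URL (e.g. http://a-site.com/user/333333/follow_user), not against the HTML source, so the quote characters carried over from the grep pattern would never match. A quick local sanity check of a quote-free pattern against a sample URL (the URL below is a made-up example in the question's format):

```shell
# --accept-regex is applied to the full URL, so drop the quote chars
# from the pattern. This URL is a placeholder, not a real link.
url='http://a-site.com/user/333333/follow_user'
printf '%s\n' "$url" | grep -qE '/follow_user$' && echo match
```

With that in mind, the wget invocation would look something like `wget -r --accept-regex '/follow_user$' http://a-site.com` (again untested; the site's actual URL shapes may require tweaks).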
Upvotes: 1