badaboum

Reputation: 31

wget command to download a web page and rename the file with the HTML title?

I would like to download an HTML web page and have the filename be the title of the HTML page.

I have found a command to get the HTML title:

wget -qO- 'https://www.linuxinsider.com/story/Austrumi-Linux-Has-Great-Potential-if-You-Speak-Its-Language-86285.html/' |   gawk -v IGNORECASE=1 -v RS='</title' 'RT{gsub(/.*<title[^>]*>/,"");print;exit}'

And it prints this: Austrumi Linux Has Great Potential if You Speak Its Language | Reviews | LinuxInsider

Found on: https://unix.stackexchange.com/questions/103252/how-do-i-get-a-websites-title-using-command-line

How could I pipe the title back into wget to use it as the filename when downloading that web page?

EDIT: In case there is no way to do this directly in wget, I found a way to simply rename the HTML files once downloaded:

Renaming HTML files using <title> tags

Upvotes: 0

Views: 972

Answers (1)

Ed Morton

Reputation: 204311

You can't wget a file, analyze its contents, and then have the same wget execution that downloaded the file magically go back in time and write it to a new file named after the contents you analyzed. Just do this:

wget -qO- '...' > tmp &&
name=$(gawk '...' tmp) &&
mv tmp "$name"

Add protection against / in name as necessary, since a / in the title would be treated as a directory separator by mv.
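
One possible way to add that protection is sketched below. The helper name sanitize_name, the "_" replacement character, the ".html" suffix, and the "untitled" fallback are all assumptions made for illustration; none of them come from wget or gawk.

```shell
# Hypothetical helper: turn an HTML title into a filename that is safe for mv.
sanitize_name() {
    # Map "/" to "_" and newlines to spaces, then trim surrounding whitespace.
    cleaned=$(printf '%s' "$1" | tr '/\n' '_ ' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    # Fall back to a fixed name (an assumption) when the title is empty.
    printf '%s.html\n' "${cleaned:-untitled}"
}

# Usage sketch, following the three steps above (the '...' parts stay elided):
#   wget -qO- '...' > tmp &&
#   name=$(sanitize_name "$(gawk '...' tmp)") &&
#   mv -- tmp "$name"
```

The `--` in the usage sketch stops mv from treating a title that begins with "-" as an option.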

Upvotes: 1
