Reputation: 381
I'm trying to write a bash script that will download the contents of a URL (not recursive) and then analyze the file that was downloaded.
If the downloaded file is a text file (e.g. index.html), I want to know the size of the file and count the number of characters in it.
If the file is an image file I just want to know the file size.
Right now I'm using wget to download the contents of the input URL, but the problem is that when I do this inside my script I don't know the name of the file that was downloaded.
How can I get the file name from wget so that I can perform some analysis on the file?
Upvotes: 1
Views: 565
Reputation: 381
I did finally manage to solve it.
#!/usr/bin/env bash
URL="$1"
FILENAME=$(date +%y-%m-%d-%T)   # Use the current date and time as the file name
wget -O "$FILENAME" "$URL"      # Download the URL to the file name we chose
FILE_INFO=$(file "$FILENAME")   # Store the output of the 'file' command
if [[ "$FILE_INFO" == *"text"* ]]
then
    echo "It's a text file"
elif [[ "$FILE_INFO" == *"image"* ]]
then
    echo "It's an image"
fi
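The script above identifies the file type but does not yet report the size or the character count that the question asked for. A minimal sketch of that step follows, assuming the standard `wc` and `file` utilities are available. The function names `file_size`, `char_count`, and `analyze_file` are hypothetical helpers, not part of the original solution.

```shell
#!/usr/bin/env bash
# Sketch only: file_size, char_count and analyze_file are hypothetical names.

# Print the size of the file in bytes.
file_size() {
    echo "$(( $(wc -c < "$1") ))"   # arithmetic expansion strips any padding from wc
}

# Print the number of characters in the file (depends on the current locale).
char_count() {
    echo "$(( $(wc -m < "$1") ))"
}

# Report the size, plus the character count when 'file' says it is text.
analyze_file() {
    local path="$1" info
    info=$(file "$path")
    echo "Size: $(file_size "$path") bytes"
    if [[ "$info" == *"text"* ]]; then
        echo "Characters: $(char_count "$path")"
    fi
}
```

In the script above, calling `analyze_file "$FILENAME"` after the `wget` line would print the size for any file and the character count only for text files. Note that `wc -m` counts characters according to the current locale, so for multi-byte encodings the result can differ from the byte count.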
Special thanks to Ben Scott for the help!
Upvotes: 2
Reputation: 318
I would suggest setting the file name wget will write to, using the -O switch. One can then generate a file name, tell wget to download the URL to that file name, and run whatever analysis tools one wants, using the file name you picked.
The idea here is that you do not have to figure out what name the web site, the URL, or wget will pick: you are controlling the parameters. That is a useful programming technique in general. The less input the user or some external program or website can provide, the simpler and more robust your program code will be.
As for picking a file name, you could use a timestamp. The date utility can generate a timestamp for you, if you give it a +FORMAT parameter. Alternatively, since you mention this is part of an analysis tool, maybe you don't want to save the file at all. In that case, try a tool like mktemp to generate a guaranteed unique file name, and then remove it before exiting.
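As a rough illustration of the mktemp variant, here is a partial sketch rather than a complete solution. The names `fetch` and `analyze_url` are hypothetical, and the only analysis shown is a byte count. It assumes wget and mktemp are installed.

```shell
#!/usr/bin/env bash
# Sketch only: fetch and analyze_url are hypothetical names.

# Download the URL ($1) to a file name that we choose ($2).
fetch() {
    wget -q -O "$2" "$1"
}

# Download to a unique temporary file, report its size, then clean up.
analyze_url() {
    local url="$1" tmp status=0
    tmp=$(mktemp) || return 1
    if fetch "$url" "$tmp"; then
        echo "bytes=$(( $(wc -c < "$tmp") ))"
    else
        echo "download failed: $url" >&2
        status=1
    fi
    rm -f -- "$tmp"    # remove the temporary file before returning
    return "$status"
}
```

A script would call it as `analyze_url "$1"`. Because the script picks the file name itself, nothing depends on what name the server or wget would have chosen, and the temporary file is removed whether or not the download succeeds.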
For more information, see the manual pages wget(1), date(1), and mktemp(1).
Not giving complete working code, in case anyone ever gets this as school assignment, and they stumble across this question. I wouldn't want to make it too easy for that hypothetical person. ;-) Of course, if someone asked more specific questions, I'd likely clarify my answer for them.
Upvotes: 1