Reputation: 379
I am new to shell scripting. I am trying to write a script that reads URLs from a text file line by line and then fetches them using wget. Also, I need to parse the log file for error messages.
#!/bin/sh
# SCRIPT: example.sh
#reading the url file line by line
DIR = /var/www/html/
# wget log file
LOGFILE = wget.log
# wget output file
FILE = dailyinfo.`date +"%Y%m%d"`
cd $DIR
FILENAME = url.txt
cat $FILENAME | while read LINE
do
echo "$LINE"
wget $LINE -O $FILE -o $LOGFILE
done
I have changed the permissions using chmod +x example.sh, but upon execution I get a command not found error for DIR, FILE and LOGFILE.
How can I correct it? Also, how should I go about the parsing part?
Upvotes: 1
Views: 28329
Reputation: 37047
I just encountered the same error message with tcsh: Command not found.
Oddly enough, it was caused by the line endings. The exact same script works fine with LF endings, but fails with CRLF endings.
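A quick way to check for and strip the carriage returns (a minimal sketch, assuming the standard file and tr utilities; dos2unix does the same job if it is installed, and the file names are taken from the question):
file example.sh        # GNU file reports "with CRLF line terminators" when CRs are present
tr -d '\r' < example.sh > example.unix.sh && mv example.unix.sh example.sh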
Upvotes: 0
Reputation: 9385
Petesh is of course correct: you need to put the = sign straight after your variable name.
For this particular case, I suggest you use wget -i input-urls.txt -o logfile.txt, and then grep the logfile for errors. wget's -i flag reads a list of URLs from a text file and "wgets" each of them, saving you from re-inventing the wheel.
If you want it in a shell script, use something like:
#!/bin/sh
DIR=/var/www/html/
# wget log file
LOGFILE=wget.log
# wget output file
FILE=dailyinfo.`date +"%Y%m%d"`
# file containing the list of URLs (url.txt from the question)
FILENAME=url.txt
cd $DIR
# just for debugging
echo "wget-ing urls from $FILENAME and writing them to $FILE in $DIR. Saving logs to $LOGFILE"
wget -i $FILENAME -O $FILE -o $LOGFILE
grep -i 'failed' $LOGFILE
Here's an example error from the logfile:
--2013-01-15 15:01:59-- http://foo/
Resolving foo... failed: nodename nor servname provided, or not known.
wget: unable to resolve host address ‘foo’
It's also useful to check the return code of wget: 0 indicates success, and non-zero values indicate various failures. You can check it by inspecting the shell variable $?.
So, incorporating that, here's a sample script:
#!/bin/sh
wget -i input-urls.txt -o logfile.txt
if [ $? -eq 0 ]; then
echo "All good!"
else
# handle failure
grep -i 'failed' logfile.txt
fi
The return codes of wget are listed on the man page (man wget, or use an online resource like this one) if you need more detail. From a quick experiment, it looks like wget returns a non-zero exit code even if just one of the URLs triggers a failure.
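If you need to know which URL failed, rather than just that something failed, one option (a sketch, going back to the line-by-line approach and the url.txt / wget.log names from the question) is to run wget once per URL and check $? each time:
#!/bin/sh
# fetch each URL separately so the exit code refers to a single URL
while read -r URL
do
    wget "$URL" -o wget.log
    if [ $? -ne 0 ]; then
        echo "failed to fetch $URL" >&2
        grep -i 'failed' wget.log
    fi
done < url.txt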
Upvotes: 2
Reputation: 94654
Problem #1: when assigning variables you must use the syntax:
VARIABLE=value
i.e. no space between the VARIABLE, the = and the new value.
Otherwise, the shell tries to execute VARIABLE as a command, which triggers the command not found error.
#!/bin/sh
# SCRIPT: example.sh
#reading the url file line by line
DIR=/var/www/html/
# wget log file
LOGFILE=wget.log
# wget output file
FILE=dailyinfo.`date +"%Y%m%d"`
cd $DIR
FILENAME=url.txt
cat $FILENAME | while read LINE
do
echo "$LINE"
wget $LINE -O $FILE -o $LOGFILE
done
This will probably get past the command not found errors.
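For the parsing part of the question, note that -O $FILE and -o $LOGFILE overwrite their targets on every pass through the loop, so only the last URL's output and log survive. A hedged variant using the same variable names appends instead (wget's -a flag appends to the log, and -O - writes the page to stdout) and greps the combined log at the end:
while read -r LINE
do
    echo "$LINE"
    # append the page to $FILE and the log messages to $LOGFILE
    wget "$LINE" -a "$LOGFILE" -O - >> "$FILE"
done < "$FILENAME"
# parse the combined log for error messages
grep -i 'failed' "$LOGFILE"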
Upvotes: 7