PowerMan2015

Reputation: 1418

Execute Perl Script from Bash script

I have the following script to scrape contact details from a list of URLs (urls.txt). When I run the following command directly from the terminal, I get the correct result:

perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' http://url.com 

However, when I call the above command from within a script, I get a "No such file or directory" error.

Here is a copy of my script:

#!/bin/bash

while read inputline
do
  //Read the url from urls.txt
  url="$(echo $inputline)"

  //execute saxon-lint to grab the contents of the XPATH from the url within urls.txt
  mydata=$("perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' $url ")

  //output the result in myfile.csv
  echo "$url,$mydata" >> myfile.csv

  //wait 4 seconds
  sleep 4

//move to the next url
done <urls.txt

I have tried changing perl to ./ but get the same result.

Can anyone advise where I am going wrong with this, please?

The error that I am receiving is:

./script2.pl: line 6: ./saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' http://find.icaew.com/listings/view/listing_id/20669/avonhurst-chartered-accountants : No such file or directory

Thanks in advance

Upvotes: 0

Views: 5251

Answers (2)

Mark Reed

Reputation: 95242

You should accept @glennjackman's answer, as that is exactly the problem. This line:

mydata=$("perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' $url ")

is telling the shell to run this command:

"perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' $url "

... including the double quotes. If you type that with the double quotes at the shell prompt, you'll get the same "No such file or directory" error message that you're getting from your script.
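To see this for yourself without saxon-lint, here is a minimal sketch that uses echo as a stand-in for the real perl command (the stand-in and the variable names are my own assumptions, not part of the original script). Because the quoted string contains a slash, bash treats the whole thing as a path to a single program, fails to find it, and returns status 127:

```shell
#!/bin/bash
# 'echo' is a stand-in for 'perl saxon-lint.pl ...' (assumption for illustration).

# Quoted: the shell looks for ONE program whose name is the entire string,
# spaces and slash included. No such program exists, so this fails.
quoted=$("echo hello a/b" 2>&1)
quoted_status=$?

# Unquoted: the shell runs 'echo' with two separate arguments.
unquoted=$(echo hello a/b)

printf 'quoted status:   %s\n' "$quoted_status"
printf 'quoted message:  %s\n' "$quoted"
printf 'unquoted output: %s\n' "$unquoted"
```

The exact wording of the error message depends on your shell and locale, but the non-zero status shows the command never ran.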

A couple other notes on the script:

  url="$(echo $inputline)"

This is a roundabout way of making a second variable into a copy of the first. A simple url=$inputline would work as well, but you could also just use read url in the first place. Not sure why you need two variables.
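As a rough sketch of reading straight into url, something like the following should do. The temporary file and the example.com URLs are hypothetical stand-ins for urls.txt, and the -r flag is my own addition (it stops read from treating backslashes as escapes):

```shell
#!/bin/bash
# Hypothetical sample input standing in for urls.txt.
tmpfile=$(mktemp)
printf '%s\n' 'http://example.com/a' 'http://example.com/b' > "$tmpfile"

count=0
# Read each line directly into 'url'; no intermediate variable is needed.
while read -r url
do
  count=$((count + 1))
  printf 'url %d: %s\n' "$count" "$url"
done < "$tmpfile"

rm -f "$tmpfile"
```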

  //output the result in myfile.csv
  echo "$url,$mydata" >> myfile.csv

Be aware that when passing a variable containing user-supplied input as the first argument to echo, you create the possibility of unexpected behavior. In this case, it's a low possibility, since a URL isn't likely to start with a - character, but it's good to get out of the habit; I would use printf. Also, instead of appending each line inside the loop, I would just redirect the output of the loop along with the input:

  printf '%s,%s\n' "$url" "$mydata"
  [...]
done <urls.txt >>myfile.csv

If you don't expect myfile.csv to exist, or it holds nothing you need to keep when the loop starts, you can change that to a single > and avoid the possibility of messy mixtures of output from different runs.
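Putting those notes together, the loop could look roughly like this. It is only a sketch: fetch_details is a hypothetical function standing in for the perl saxon-lint.pl call, the sample URLs and temporary directory are made up so the example runs on its own, and I have left out the sleep 4 to keep the demo quick:

```shell
#!/bin/bash
# fetch_details is a hypothetical stand-in for:
#   perl saxon-lint.pl --html --xpath '...' "$url"
fetch_details() {
  printf 'details for %s' "$1"
}

# Temporary working directory with made-up input (assumption for the demo).
workdir=$(mktemp -d)
printf '%s\n' 'http://example.com/a' 'http://example.com/b' > "$workdir/urls.txt"

while read -r url
do
  # No double quotes around the whole command; quote only the argument.
  mydata=$(fetch_details "$url")
  printf '%s,%s\n' "$url" "$mydata"
done < "$workdir/urls.txt" > "$workdir/myfile.csv"   # one redirect for the whole loop

cat "$workdir/myfile.csv"
```

In the real script you would swap the body of fetch_details for the saxon-lint command and put the sleep back inside the loop.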

Upvotes: 1

glenn jackman

Reputation: 246744

Don't put double quotes inside the command substitution.

Not:

mydata=$("perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' $url ")
# .......^...........................................................................................^

But this:

mydata=$(perl saxon-lint.pl --html --xpath 'string-join(//div[2]/div[2]/div[1]/div[2]/div[2])' $url )

With the double quotes, you're instructing bash to look for a program named "perl saxon-lint.pl --html etc etc" in the path, spaces and all, and clearly no such program exists.
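Quotes are still fine around individual arguments inside the substitution, and quoting $url there is a good habit. A small sketch, assuming printf as a stand-in for perl saxon-lint.pl and a hypothetical URL containing a space, shows each quoted piece arriving as its own argument:

```shell
#!/bin/bash
# 'printf' stands in for 'perl saxon-lint.pl' (assumption); the URL is hypothetical.
url='http://example.com/page?name=a b'

# Quotes go around individual arguments, never around the whole command.
# printf repeats the '[%s]' format once per argument it receives.
mydata=$(printf '[%s]' --xpath 'string-join(//div[2])' "$url")

printf '%s\n' "$mydata"
# prints: [--xpath][string-join(//div[2])][http://example.com/page?name=a b]
```

Three bracketed groups means three arguments: the command name was found normally and the URL stayed in one piece despite the space.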

Upvotes: 6
