nanthil

Reputation: 65

Why is my shell command working at the prompt, but not as a bash script?

I'm new to bash scripting, though I'm getting fairly familiar with shell scripting in general. I wrote this text-transform script for a client's feed. It extracts the URLs I want and the titles of the articles. Awesome.

echo $(var=$(curl -L website.com/news)) | 
  grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var |
    result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g') ; let this=0 ; echo "$result" | while read line ; do if ((this % 2 == 0 )) ; then echo website.com/news$line ; else echo $line ; fi ; let this+=1 ; done

When I extract it to a file and run it with bash or sh myThing.sh, it doesn't work at all. The only thing that gets echoed is 'website.com/news', and when I echo $this, all I get is 1. What am I doing wrong?

#!/bin/bash   
echo $(var=$(curl -L website.com/news)) | 
  grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var |
    result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g') 

 let this=0 

 echo "$result" | while read line 
 do 
    if ((this % 2 == 0 )) 
    then 
        echo website.com/news$line 
    else 
        echo $line 
    fi 
    let this+=1 
done

edit:

#!/bin/bash

var=$(curl -L linux.com/news) 
select=$(grep -Po '<h3 class="article-list__title"><a href="\K[^<]+' <<< $var)
result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g') 

 let this=0 

 echo "$result" | while read line 
 do 
    if ((this % 2 == 0 )) 
    then 
        echo website.com/news$line 
    else 
        echo $line 
    fi 
    let this+=1 
done

Upvotes: 0

Views: 1288

Answers (2)

alvits

Reputation: 6758

This line is wrong. You are trying to pass each command's standard output through the pipes, but none of the commands writes anything to standard output; the only thing that gets printed goes to standard error.

echo $(var=$(curl -L website.com/news)) | grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var | result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g') 

I'll break down what I believe you are attempting to do.

echo $(var=$(curl -L website.com/news))

The above code only prints to standard error, which is a separate stream from standard output. curl's standard output is captured and assigned to $var, so the only thing echo passes to the next process is a newline.

grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var

The here-string <<< takes precedence over the pipe as grep's input. But the variable $var is lost, because it was defined inside a subshell and not in the parent shell. Thanks to @mklement0.

The proper way to accomplish all this is to not use $var. All you wanted is the value stored in $result.

result=$(curl -L website.com/news | grep -Po '<h3 class="article-link"><a href="\K[^<]+'| sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')

I don't intend to optimize your script; this is only a suggested fix. A more comprehensive answer to your question, "Why is my shell command working at the prompt, but not as a bash script?", is given by mklement0 here.

Upvotes: 1

mklement0

Reputation: 437823

This answer solves the OP's specific problem, but to address the question "Why is my shell command working at the prompt, but not as a bash script?" generally, Etan Reisner provides an excellent answer in the comments:
"You are either not running that exact command or it "works" because you have shell state that is affecting things in ways you take to be "working" and your script doesn't have that state. Try launching an entirely new shell session and see if that command, on its own, works for you there."
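One way to see the "shell state" effect described above is sketched below. It assumes a leftover, non-exported variable named var in the interactive shell (the value is made up for illustration) and uses a child bash process to stand in for running the script file.

```shell
#!/bin/bash
# Simulate leftover state: var is set in this shell but NOT exported.
unset var
var="left over from an earlier experiment"

# A script started with `bash myThing.sh` is a new process, like this child.
# It only inherits exported variables, so it does not see var.
seen=$(bash -c 'printf "%s" "${var:-<unset>}"')

echo "current shell: [$var]"
echo "child shell:   [$seen]"
```

The current shell still sees the old value, while the child reports the variable as unset, which is the situation the script file is in.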

echo $(var=...) will assign a value to variable $var, but will not output anything, so the echo command will simply print a newline.

Furthermore, because the assignment to $var happens inside $(...) (a command substitution), it is confined to the subshell that the command inside the substitution ran in, so $var will not be defined in the calling shell.
(A subshell is a child process that contains a duplicate of the current shell's environment, without being able to modify the current shell's environment).
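A minimal sketch of both effects, using a placeholder string instead of real curl output: the assignment inside $(...) prints nothing, and the variable does not survive the subshell.

```shell
#!/bin/bash
unset var

# The assignment runs inside the $(...) subshell and produces no output,
# so echo has nothing to print but a newline (stripped by the outer $(...)).
printed=$(echo $(var="page contents"))

echo "echo printed:   [$printed]"
echo "var afterwards: [${var:-}]"
```

Both bracket pairs come out empty: nothing was printed, and var was never set in the calling shell.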

More generally, you cannot meaningfully define variables inside a pipeline - they will neither be visible to other pipeline segments, nor after the pipeline finishes.[1]

The only reason your [original] command could ever have worked is if $var had a preexisting value in your shell. In fact, given that you provide input to grep via a here-string (<<<), the first segment of your pipeline (echo ...) is entirely ignored.
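The following sketch illustrates why the first pipeline segment is ignored. It uses cat in place of grep and made-up strings, so only the redirection behavior is being shown.

```shell
#!/bin/bash
# var plays the role of a value that already existed in the interactive shell.
var="from the here-string"

# The pipe connects echo to cat's stdin first, but the here-string then
# replaces stdin, so whatever echo wrote is never read.
got=$(echo "from the pipe" | cat <<< "$var")

echo "$got"
```

With a preexisting $var this appears to "work" at the prompt; in a fresh script, $var is empty and the command receives only an empty line.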

To pass the output of curl through the pipeline to grep and then to sed, no intermediate variables are needed at all.
Furthermore, your sed command is lacking input: you probably meant to feed it $var in your first attempt, and $select in the 2nd (your 2nd attempt came close to a correct solution).

What you were probably ultimately looking for:

result=$(curl -L website.com/news | 
  grep -Po '<h3 class="article-link"><a href="\K[^<]+' |
    sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')

# ... processing of "$result"

Some additional notes:

  • You could combine the 3 sed calls into a single one.
  • You could feed the pipeline output directly into your while loop, without the need for intermediate variable $result.
  • You should generally double-quote variable references (e.g., use "$line" instead of $line) to protect them from interpretation by the shell (word-splitting, globbing).
  • let this+=1 is better expressed as (( ++this )) in modern Bash.
  • This answer of mine contains links to resources for learning about bash.
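The first four notes could be applied together roughly as follows. This is a sketch, not a drop-in replacement: format_matches is a hypothetical helper name, the sample input in the usage note is invented, and \n in the sed replacement assumes GNU sed (as in the original commands).

```shell
#!/bin/bash
# Hypothetical helper: reads grep's matches on stdin and prints alternating
# URL and title lines, with no intermediate $result variable.
format_matches() {
  local this=0 line
  # One sed process with three expressions instead of three sed processes.
  sed -e 's/"/\n/g' -e 's/ \//\n\//g' -e 's/>//g' |
    while IFS= read -r line; do
      if (( this % 2 == 0 )); then
        printf '%s\n' "website.com/news$line"   # even lines: URL path
      else
        printf '%s\n' "$line"                   # odd lines: article title
      fi
      (( ++this ))
    done
}

# Intended use (needs network access and GNU grep for -P):
# curl -sL website.com/news |
#   grep -Po '<h3 class="article-link"><a href="\K[^<]+' |
#   format_matches
```

For a quick offline check, printf '%s\n' '/a-story">First Title' | format_matches should print the prefixed URL followed by the title on the next line.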

[1] All commands involved in a pipeline by default run in a subshell in bash, so they all see copies of the parent shell's variables. Bash 4.2+ offers the lastpipe option (off by default) to allow you to create variables in the current shell instead of in a subshell, by running the last pipeline segment (only) in the current shell instead of in a subshell, to facilitate scenarios such as ... | while read -r line ... and have $line continue to exist after the pipeline finishes.
Note that this still doesn't enable defining a variable in an earlier pipeline segment in the hopes that a later segment will see it - this can never work, because the commands that make up a pipeline are launched at the same time, and it is only through coordination of the input and output streams that effective left-to-right processing happens.
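A small sketch of the lastpipe behavior described in this footnote, assuming Bash 4.2+ run as a script (lastpipe only takes effect when job control is off, which is the default for non-interactive shells). The variable names are illustrative.

```shell
#!/bin/bash
# Default behavior: the while loop runs in a subshell, so the parent's
# count is unchanged after the pipeline.
count=0
printf 'a\nb\nc\n' | while IFS= read -r line; do (( ++count )); done
without=$count

# With lastpipe, the last pipeline segment runs in the current shell,
# so the updated count survives.
shopt -s lastpipe
count=0
printf 'a\nb\nc\n' | while IFS= read -r line; do (( ++count )); done
with=$count

echo "without lastpipe: $without"
echo "with lastpipe:    $with"
```

At an interactive prompt, job control is normally on, so the same commands would not show the difference unless it is disabled with set +m.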

Upvotes: 3
