user3398630

Reputation: 21

Special characters in variable to curl in a bash script

Dear highly appreciated community,

First of all, let me say thank you for years of valuable reading and learning. Until now I have always found an answer to my questions by searching; unfortunately, this time I didn't find any clue.

I am writing what I thought would be a small and easy script to download several web pages listed in a .csv file.

The file is structured as follows:

[email protected];http://www.url.com/?s=NUMBER&a=NUMBER&l=COUNTRY&c=NUMBER&h=NUMBER

where NUMBER is a number and COUNTRY is a two-letter country code, "uk" or "fr" for example.

The URL always has the same beginning, http://www.URL.com/?s=, followed by 4 settings.

I would be satisfied with just downloading those hundreds of pages as they are, because they do not contain any special images.

My script looks like this:

#!/bin/bash
while read line
do
    #echo $line
    #curl -o download/test.htm $line
    varA="$( echo $line|awk -F';' '{print $1}' )"
    varB="$( echo $line|awk -F';' '{print $2}' )"
    varB1="$( echo $varB|awk -F'&' '{print $2}' )"
    varB2="$( echo $varB|awk -F'&' '{print $3}' )"
    varB3="$( echo $varB|awk -F'&' '{print $4}' )"
    varB4="$( echo $varB|awk -F'&' '{print $5}' )"
    echo 'Downloading survey of:'
    echo $varA
    curl -o $varA.htm "http://www.url.com/?s=771223&"$varB1"&"$varB2"&"$varB3"&"$varB4
    echo "--------------------------------------------------------------"
    echo ""
done < Survey.csv

The downloaded page always contains an HTTP 400 error.

I already tried curl -o $varA.htm $varB, which also returned the HTTP 400 error.

Thinking the '&' was the culprit, I ended up with the script you see above as my last try.

Many thanks in advance! Andre

Upvotes: 2

Views: 608

Answers (2)

Scrutinizer

Reputation: 9926

Similar to the remarks by @chepner, try something like:

while IFS=';?&' read varA varB0 varB1 varB2 varB3 varB4
do
  echo 'Downloading survey of:'
  echo "$varA"
  curl -o "$varA.htm" "http://www.url.com/?s=771223&${varB1}&${varB2}&${varB3}&${varB4}"
done < Survey.csv
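
To see which field ends up in which variable, here is a minimal sketch using a hypothetical line in the format from the question (placeholder values, not real data):

# Hypothetical sample line; IFS=';?&' makes read split on ';', '?' and '&'.
line='me@example.com;http://www.url.com/?s=1&a=2&l=uk&c=3&h=4'
IFS=';?&' read varA varB0 varB1 varB2 varB3 varB4 <<< "$line"
# varA  -> me@example.com
# varB0 -> http://www.url.com/
# varB1 -> s=1
# varB2 -> a=2
# varB3 -> l=uk
# varB4 -> c=3&h=4   (the last name keeps the remaining fields and their delimiters)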

Alternatively, since in this case the last 4 fields are used unchanged:

while IFS=';?&' read varA varB0 rest
do
  echo 'Downloading survey of:'
  echo "$varA"
  curl -o "$varA.htm" "http://www.url.com/?s=771223&$rest"
done < Survey.csv
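
In this second form, rest collects everything after the base URL, '&' separators included, which is why it can be appended to the URL as-is. A quick check with the same hypothetical line:

IFS=';?&' read varA varB0 rest <<< 'me@example.com;http://www.url.com/?s=1&a=2&l=uk&c=3&h=4'
echo "$rest"   # prints: s=1&a=2&l=uk&c=3&h=4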

Upvotes: 2

anubhava

Reputation: 784998

Rather than using multiple awk calls, you can do it in a single awk:

s='[email protected];http://www.url.com/?s=NUMBER&a=NUMBER&l=COUNTRY&c=NUMBER&h=NUMBER'
awk -F '[;&?]' '{for (i=1; i<=NF; i++) print $i}' <<< "$s"
[email protected]
http://www.url.com/
s=NUMBER
a=NUMBER
l=COUNTRY
c=NUMBER
h=NUMBER

You can store results in BASH arrays:

arr=( $(awk -F '[;&?]' '{for (i=1; i<=NF; i++) printf "%s ", $i}' <<< "$s") )
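
Then, assuming none of the fields contain whitespace or glob characters (the unquoted command substitution is word-split by the shell), the individual pieces can be accessed by index or looped over:

echo "${arr[2]}"      # third field, e.g. s=NUMBER
echo "${#arr[@]}"     # number of fields
for field in "${arr[@]}"; do echo "$field"; done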

Upvotes: 1
