user1610717
user1610717

Reputation: 325

unable to loop through array in bash for PostgreSQL query export

I have a PostgreSQL query that I'd like to run for multiple geographic areas via a loop. I want to use the elements in the array to modify the query and the name of the csv file where I'm exporting the data to. So in essence, I want the query to run on ...cwa = 'MFR'... and export to hourly_MFR.csv, then run on ...cwa = 'PQR'... and export to hourly_PQR.csv, and so on.

Here's what I have so far. I thought maybe the EOF in the script might be causing problems, but I couldn't figure out how to get the loop to work while maintaining the general format of the script.

Also, the query/script, without the looping (excluding declare, for, do, done statements) works fine.

dbname="XXX"

username="XXXXX"

psql $dbname $username << EOF

declare -a arr=('MFR', 'PQR', 'REV')

for i in "${arr[@]}"

do

\COPY
(SELECT d.woyhh,
      COALESCE(ct.ct, 0) AS total_count
FROM
 (SELECT f_woyhh(d::TIMESTAMP) AS woyhh
  FROM generate_series(TIMESTAMP '2018-01-01', TIMESTAMP '2018-12-31', interval '1 hour') d) d
LEFT JOIN
 (SELECT f_woyhh((TIME)::TIMESTAMP) AS woyhh,
         count(*) AS ct
  FROM counties c
  JOIN ltg_data d ON ST_contains(c.the_geom, d.ltg_geom)
  WHERE cwa = $i
  GROUP BY 1) ct USING (whh)
ORDER BY 1) TO /var/www/html/GIS/ltg_db/bigquery/hourly_$i.csv CSV HEADER;

done

EOF

Thanks for any help!

Upvotes: 0

Views: 478

Answers (2)

sygibson
sygibson

Reputation: 389

In @Lienhart Woitok's answer above, the solution will definitely work. However - note that this has the side effect of executing a new 'psql' call, database connection setup, authentication, and subsequent response returned; followed by closing the connection - for each iteration of the loop.

In this case you are only running 3 iterations of the loop, so it may not be a significant issue. However, if you expand the usage to run more iterations, you may want to optimize this to only run a single DB connection and batch query it.

To do that, use of a temporary working file to build the SQL commands may be necessary. There are other ways, but this is relatively simple to use and debug:

QUERY_FILE=$(mktemp /tmp/query.XXXXXXX)

# note the use of an array isn't really necessary in this use
# case - and a simple set of values can be used equally as well
CWA="MFR PQR REV"

for i in $CWA
do
 cat <<EOF >> $QUERY_FILE
     <ADD_YOUR_QUERY_STATEMENTS_HERE>
EOF
done

psql --file=$QUERY_FILE $dbname $username

if (( $? )) 
then
  echo "query failed (QUERY_FILE: ($QUERY_FILE')"
  exit 1 
else
  echo "query succeeded"
  rm -f $QUERY_FILE
  exit 0
fi

Upvotes: 1

Lienhart Woitok
Lienhart Woitok

Reputation: 426

I think you are nearly there, you just have to reorder some lines. Try this:

dbname="XXX"

username="XXXXX"

declare -a arr=('MFR', 'PQR', 'REV')

for i in "${arr[@]}"
do

psql $dbname $username << EOF

\COPY
(SELECT d.woyhh,
      COALESCE(ct.ct, 0) AS total_count
FROM
 (SELECT f_woyhh(d::TIMESTAMP) AS woyhh
  FROM generate_series(TIMESTAMP '2018-01-01', TIMESTAMP '2018-12-31', interval '1 hour') d) d
LEFT JOIN
 (SELECT f_woyhh((TIME)::TIMESTAMP) AS woyhh,
         count(*) AS ct
  FROM counties c
  JOIN ltg_data d ON ST_contains(c.the_geom, d.ltg_geom)
  WHERE cwa = $i
  GROUP BY 1) ct USING (whh)
ORDER BY 1) TO /var/www/html/GIS/ltg_db/bigquery/hourly_$i.csv CSV HEADER;

EOF

done

The declare and the for loop are part of the bash script while everything between <<EOF and EOF are part of your Postgresql query.

Upvotes: 2

Related Questions