Reputation: 325
I have a PostgreSQL query that I'd like to run for multiple geographic areas via a loop. I want to use the elements in the array to modify the query and the name of the csv file where I'm exporting the data to. So in essence, I want the query to run on ...cwa = 'MFR'... and export to hourly_MFR.csv, then run on ...cwa = 'PQR'... and export to hourly_PQR.csv, and so on.
Here's what I have so far. I thought maybe the EOF in the script might be causing problems, but I couldn't figure out how to get the loop to work while maintaining the general format of the script.
Also, the query/script, without the looping (excluding declare, for, do, done statements) works fine.
dbname="XXX"
username="XXXXX"
psql $dbname $username << EOF
declare -a arr=('MFR', 'PQR', 'REV')
for i in "${arr[@]}"
do
\COPY
(SELECT d.woyhh,
COALESCE(ct.ct, 0) AS total_count
FROM
(SELECT f_woyhh(d::TIMESTAMP) AS woyhh
FROM generate_series(TIMESTAMP '2018-01-01', TIMESTAMP '2018-12-31', interval '1 hour') d) d
LEFT JOIN
(SELECT f_woyhh((TIME)::TIMESTAMP) AS woyhh,
count(*) AS ct
FROM counties c
JOIN ltg_data d ON ST_contains(c.the_geom, d.ltg_geom)
WHERE cwa = $i
GROUP BY 1) ct USING (whh)
ORDER BY 1) TO /var/www/html/GIS/ltg_db/bigquery/hourly_$i.csv CSV HEADER;
done
EOF
Thanks for any help!
Upvotes: 0
Views: 478
Reputation: 389
In @Lienhart Woitok's answer above, the solution will definitely work. However - note that this has the side effect of executing a new 'psql' call, database connection setup, authentication, and subsequent response returned; followed by closing the connection - for each iteration of the loop.
In this case you are only running 3 iterations of the loop, so it may not be a significant issue. However, if you expand the usage to run more iterations, you may want to optimize this to only run a single DB connection and batch query it.
To do that, use of a temporary working file to build the SQL commands may be necessary. There are other ways, but this is relatively simple to use and debug:
QUERY_FILE=$(mktemp /tmp/query.XXXXXXX)
# note the use of an array isn't really necessary in this use
# case - and a simple set of values can be used equally as well
CWA="MFR PQR REV"
for i in $CWA
do
cat <<EOF >> $QUERY_FILE
<ADD_YOUR_QUERY_STATEMENTS_HERE>
EOF
done
psql --file=$QUERY_FILE $dbname $username
if (( $? ))
then
echo "query failed (QUERY_FILE: ($QUERY_FILE')"
exit 1
else
echo "query succeeded"
rm -f $QUERY_FILE
exit 0
fi
Upvotes: 1
Reputation: 426
I think you are nearly there, you just have to reorder some lines. Try this:
dbname="XXX"
username="XXXXX"
declare -a arr=('MFR', 'PQR', 'REV')
for i in "${arr[@]}"
do
psql $dbname $username << EOF
\COPY
(SELECT d.woyhh,
COALESCE(ct.ct, 0) AS total_count
FROM
(SELECT f_woyhh(d::TIMESTAMP) AS woyhh
FROM generate_series(TIMESTAMP '2018-01-01', TIMESTAMP '2018-12-31', interval '1 hour') d) d
LEFT JOIN
(SELECT f_woyhh((TIME)::TIMESTAMP) AS woyhh,
count(*) AS ct
FROM counties c
JOIN ltg_data d ON ST_contains(c.the_geom, d.ltg_geom)
WHERE cwa = $i
GROUP BY 1) ct USING (whh)
ORDER BY 1) TO /var/www/html/GIS/ltg_db/bigquery/hourly_$i.csv CSV HEADER;
EOF
done
The declare
and the for
loop are part of the bash script while everything between <<EOF
and EOF
are part of your Postgresql query.
Upvotes: 2