Reputation: 2257
I've the following file queries.sql
that contains a number of queries, structured like this:
/* Query 1 */
SELECT cab_type_id,
Count(*)
FROM trips
GROUP BY 1;
/* Query 2 */
SELECT passenger_count,
Avg(total_amount)
FROM trips
GROUP BY 1;
/* Query 3 */
SELECT passenger_count,
Extract(year FROM pickup_datetime),
Count(*)
FROM trips
GROUP BY 1,
2;
Then I've written a regex, that finds all those queries in the file:
/\*[^\*]*\*/[^;]*;
What I'd like to achieve is the following:
EXPLAIN ANALYZE
q1.txt
with the corresponding output, query 2 create q2.txt
etc.One of my main challenges (there are no problems, right? ;-)) is, that I'm rather unfamiliar with the linux bash I've to use.
I tried cat queries.sql | grep '/\*[^\*]*\*/[^;]*;'
but that doesn't return anything.
So a solution could look like:
count = 0
for query in (cat queries.sql | grep 'somehow-here-comes-my-regex') do
count = $count+1
query = 'EXPLAIN ANALYZE '+query
psql -U postgres -h localhost -d nyc-taxi-data -c query > 'q'$count'.txt'
Except from: that doesn't work and I don't know how to make it work.
Upvotes: 1
Views: 1012
Reputation: 203645
It sounds like this is what you're looking for:
awk -v RS= -v ORS='\0' '{print "EXPLAIN ANALYZE", $0}' queries.sql |
while IFS= read -r -d '' query; do
psql -U postgres -h localhost -d nyc-taxi-data -c "$query" > "q$((++count)).txt"
do
The awk statement outputs each query as a NUL-terminated string, the shell loop reads it as such one at a time and calls psql on it. Simple, robust, efficient, etc...
Upvotes: 0
Reputation: 1528
You have to omit spaces for variable assignments.
The following script would help. Save it in a file eg.: explain.sh
, make it executable using chmod 0700 explain.sh
and run in the following way: ./explain.sh query.sql
.
#!/bin/bash
qfile="$1"
# number of queries
n="$(grep -oP '(?<=Query )[0-9]+ ' $qfile)"
count=1
for q in $n; do
# Corrected solution, modified after the remarks of @EdMorton
qn="EXPLAIN ANALYZE $(awk -v n="Query $q" 'flag; $0 ~ n {flag=1} /;/{flag=0}' $qfile)"
#qn="EXPLAIN ANALYZE $(awk -v n=$q "flag; /Query $q/{flag=1} /;/{flag=0}" $qfile)"
# psql -U postgres -h localhost -d nyc-taxi-data -c "$qn" > q$count.txt
echo "$qn" > q$count.txt
count=$(( $count + 1 ))
done
First of all, the script accounts for one argument (your example input query.sql
file). It reads out the number of queries and save into a variable n
. Then in a for
loop it iterates through the query numbers and uses awk to extract the number n
query and append EXPLAIN ANALYZE to the beginning. Then you can run your psql
with the desired query. Here I commented out the psql part. This example script only creates qN.txt files for each explain query.
UPDATE:
The awk
part: It is possible to use a shell variable in awk using the -v
flag. Here we creates an awk variable n
with the value of the q
shell variable. n
is used to create the starter pattern ie: Query 1. awk -v n="Query $q" 'flag; $0 ~ n {flag=1} /;/{flag=0}' $qfile
matches everything between Query 1
and the first occurence of a semi-colon (;
) excluding the line of Query 1
from query.sql
. The $(...)
means command-substitution in bash, thus we can save the output of a shell command into a variable. Here we save the output of awk and prefix it with the EXPLAIN ANALYZE string.
Here is a great answer about awk pattern matching.
Upvotes: 1