Reputation: 57
I have this basic sql query which gets an average score and total score count for "level 1" and outputs to csv file
SELECT judge,AVG(score), count(judge)
FROM results
WHERE judge NOT LIKE '% AND %'
AND score >30
AND (class like '%1st%' OR class LIKE '%first%' OR class LIKE '%1 st%')
GROUP BY judge
INTO OUTFILE 'c:/py/2013/all/judges-by-level.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n';
I want to expand this query to get average scores from multiple "levels" (derived from class column's string value) My current level 1 WHERE looks like:
(class LIKE '%1st%' OR class LIKE '%first%' or class LIKE '%1 st%'
)
I want to change this query to get level 2, level 3 etc. using a similar LIKE
(class LIKE '%2nd%' OR class LIKE '%second%' OR class LIKE '%2 nd%'
)
Original data looks like:
judge , score ,class
John Doe, 55.123 ,First Level Championship
Jane Doe, 65.123 ,1st Amateur Division
John Doe, 75.123 ,Second Level Championship
The output I am going for would be a csv file formatted like this:
judge, level1_avg, level1_count, level2_avg, level2_count
John Doe,55.567,123,51.123,88
Jane Doe,58.123,66,53.123,247
What SQL syntax would I use to get the desired output?
Upvotes: 2
Views: 100
Reputation: 20590
Try a union? Although this won't be efficient after so many levels. (notice the INTO OUTFILE
and ORDER BY
are in the final union clause)
SELECT judge, AVG(score) as 'level1_avg', count(judge) as 'level1_count', '' as 'level2_avg', '' as 'level2_count', '' as 'level3_avg', '' as 'level3_count'
FROM results
WHERE judge NOT LIKE '% AND %'
AND score >30
AND (class like '%1st%' OR class LIKE '%first%' OR class LIKE '%1 st%')
GROUP BY judge
UNION ALL
SELECT judge, '' as 'level1_avg', '' as 'level1_count', AVG(score) as 'level2_avg', count(judge) as 'level2_count', '' as 'level3_avg', '' as 'level3_count'
FROM results
WHERE judge NOT LIKE '% AND %'
AND score >30
AND (class like '%2nd%' OR class LIKE '%second%' OR class LIKE '%2 nd%')
GROUP BY judge
UNION ALL
SELECT judge, '' as 'level1_avg', '' as 'level1_count', '' as 'level2_avg', '' as 'level2_count', AVG(score) as 'level3_avg', count(judge) as 'level3_count'
INTO OUTFILE 'c:/py/2013/all/judges-by-level.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n';
FROM results
WHERE judge NOT LIKE '% AND %'
AND score >30
AND (class like '%3rd%' OR class LIKE '%third%' OR class LIKE '%3 rd%')
GROUP BY judge
ORDER BY judge
EDIT: added blanks in the appropiate columns in the union
Upvotes: 0
Reputation: 26343
Do you want to kind of cross-tabulate based on first/1st/1 st
and second/2nd/2 nd
?
As a side point, the LIKE ... OR LIKE ... OR LIKE
can be replaced by a regular expression to shorten things.
Assuming I understnd your question correctly, this should be close to what you need:
SELECT
judge,
AVG(case when class rlike '1st|first|1 st' THEN score END) level1_avg,
COUNT(case when class rlike '1st|first|1 st' THEN judge END) level1_count,
AVG(case when class rlike '2nd|second|2 nd' THEN score END) level2_avg,
COUNT(case when class rlike '2nd|second|2 nd' THEN judge END) level2_count
FROM results
where judge not like '% and %' AND score > 30
group by judge
INTO OUTFILE 'c:/py/2013/all/judges-by-level.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n';
To take apart the level1_avg
result column:
class
contains 1st
or first
or 1 st
, the CASE
returns score
so it's included in the average.class
doesn't contain 1st
or first
or 1 st
, the CASE
returns null
so it's not included in the average.The other aggregated columns use similar logic.
Upvotes: 1