Reputation: 27476

MySQL Query - getting missing records when using group-by

I have a query:

select score, count(1) as 'NumStudents' from testresults where testid = 'mytestid'
group by score order by score

where the testresults table contains students' results for each test. A sample result looks like the following, assuming the maximum marks for the test is 10.

score, NumStudents

0 10
1 20
2 12
3 5
5 34
..
10 23

As you can see, this query does not return a record for any score that no student achieved. For example, nobody scored 4/10 in the test, so there is no record for score = 4 in the query output.

I would like to change the query so that these missing records are returned with 0 as the value of the NumStudents field. The final output would then have max + 1 records, one for each possible score.

Any ideas?

EDIT:

The database contains several tests, and the maximum marks for each test is part of the test definition. So a new table storing all possible scores is not feasible: whenever I create a new test with a new maximum, I would need to make sure that table is updated to contain those scores as well.

Upvotes: 0

Answers (4)

J.D. Fitz.Gerald

Reputation: 2957

Just as a mental exercise I came up with this to generate a sequence in MySQL. It works as long as the square of the number of tables in all databases on the box is at least the length of the sequence, since the cross join has to supply one row per value. I wouldn't recommend it for production though ;)

SELECT @n:=@n+1 as n from (select @n:=-1) x, Information_Schema.Tables y, Information_Schema.Tables WHERE @n<20; /* sequence from 0 to 20 inclusive */

Upvotes: 0

j_random_hacker

Reputation: 51226

Does MySQL support set-returning functions? Recent releases of PostgreSQL have a function, generate_series(start, stop), that produces the value start on the first row, start+1 on the second, and so on up to stop on the last row. The advantage of this is that you can put this function in a subselect in the FROM clause and then join to it, instead of creating and populating a table and joining to that as suggested by le dorfier and Bill Karwin.
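MySQL itself has no built-in generate_series, but releases from 8.0 on support recursive common table expressions, which can play the same role. A minimal sketch, assuming MySQL 8.0 or later and a Tests table with testid and maxmarks columns (those names are borrowed from Bill Karwin's answer and are assumptions about the asker's schema):

```sql
-- Assumes MySQL 8.0+ and a Tests(testid, maxmarks) table; adjust names to your schema.
WITH RECURSIVE seq (score, maxmarks) AS (
  -- Anchor row: start at 0 and carry the test's maximum along
  SELECT 0, maxmarks FROM Tests WHERE testid = 'mytestid'
  UNION ALL
  -- Recursive step: add the next score until the maximum is reached
  SELECT score + 1, maxmarks FROM seq WHERE score < maxmarks
)
SELECT seq.score, COUNT(tr.score) AS NumStudents
FROM seq
  LEFT OUTER JOIN testresults tr
    ON tr.score = seq.score AND tr.testid = 'mytestid'
GROUP BY seq.score
ORDER BY seq.score;
```

Because COUNT(tr.score) ignores the NULLs that the outer join produces for unmatched scores, a score nobody achieved comes back with 0. No extra table has to be maintained when a test with a new maximum is added.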

Upvotes: 0

Bill Karwin

Reputation: 562398

SQL is good at working with sets of data values in the database, but not so good at sets of data values that are not in the database.

The best workaround is to keep one small table for the values you need to range over:

CREATE TABLE ScoreValues (score int);
INSERT INTO ScoreValues (score) 
  VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);

Given your comment that you define the max marks of a test in another table, you can join to that table in the following way, as long as ScoreValues is sure to contain every value from 0 up to the largest max marks of any test:

SELECT v.score, COUNT(tr.score) AS 'NumStudents'
FROM ScoreValues v 
  JOIN Tests t ON (v.score <= t.maxmarks)
  LEFT OUTER JOIN TestResults tr ON (v.score = tr.score AND t.testid = tr.testid)
WHERE t.testid = 'mytestid'
GROUP BY v.score;
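That precondition can be checked with a query along these lines. This is only a sketch, assuming the same Tests(testid, maxmarks) and ScoreValues(score) tables as above; it lists every test whose max marks exceed the largest value stored in ScoreValues, so an empty result means the table covers all tests:

```sql
-- Lists tests that ScoreValues does not yet cover.
-- COALESCE handles an empty ScoreValues table, where MAX() returns NULL.
SELECT t.testid, t.maxmarks
FROM Tests t
WHERE t.maxmarks > COALESCE((SELECT MAX(score) FROM ScoreValues), -1);
```

It only compares against the highest stored value, so it assumes ScoreValues has no gaps below that value.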

Upvotes: 2

dkretz

Reputation: 37655

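The most obvious way would be to create a table named "Scores" holding every possible score and left outer join your table to it.

SELECT s.score, COUNT(ts.score) AS scoreCount
FROM Scores AS s
LEFT OUTER JOIN testScores AS ts
ON s.score = ts.score
GROUP BY s.score
-- COUNT(ts.score) skips the NULLs of unmatched scores, so they count as 0;
-- COUNT(1) would report 1 for them.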

If you don't want to create the table, you could use a query that returns a single row with one pair of columns per score:

SELECT
1 as score, SUM(CASE WHEN ts.score = 1 THEN 1 ELSE 0 END) AS scoreCount,
2 as score, SUM(CASE WHEN ts.score = 2 THEN 1 ELSE 0 END) AS scoreCount,
3 as score, SUM(CASE WHEN ts.score = 3 THEN 1 ELSE 0 END) AS scoreCount,
4 as score, SUM(CASE WHEN ts.score = 4 THEN 1 ELSE 0 END) AS scoreCount,
... 10 as score, SUM(CASE WHEN ts.score = 10 THEN 1 ELSE 0 END) AS scoreCount
FROM testScores AS ts

Upvotes: 1
