Reputation: 61
I know that there're several existing questions like this one, but mine is little bit complicated (for me).
For example, I have index, data, log and flag these four tables.
index: id program compiler flag_id data_id
data: id machine runtime date index_id
log: index_id data_id log
flag: id flag_name
The flag_id, data_id and index_id indicate the corresponding table's main id. Those main id is auto incremental and unique in index and data tables. But flag table's id could be duplicated. For example, a program may use different flags to compile:
index: 123 jacobi gcc 11 345
data: 345 host1 3:21 2015-05-13 22:56:12 123
log: 123 345 "PASS"
flag: 11 "-g"
11 "-mp"
11 "-static"
So this program is compiled like:
gcc -g -mp -static jacobi.c
and the execution time is 3'21'' and finished by 2015-05-13 22:56:12.
The thing is, many people may run this program and may be due to various of reasons, the exact the same test may PASS or FAIL. And only the last run (based on the timestamp--date or the MAX(data_id) or the MAX(index_id) for the same test) will be showed in the webpage. How to determine whether tests are the same (DISTINCT)? These field should be the same:
program, compiler and all the flags (no order issue)
So, you could see, the flags are stored in the flag table. I don't know how to correlate the index and flag table to find the same test.
Final example to show my expected output:
index: 123 jacobi gcc 11 345
data: 345 host1 3:21 2015-05-13 22:56:12 123
log: 123 345 "PASS"
flag: 11 "-g"
11 "-mp"
11 "-static"
index: 126 jacobi gcc 15 397
data: 397 host2 3:14 2015-05-13 23:13:28 126
log: 126 397 "FAIL"
flag: 15 "-g"
15 "-mp"
15 "-static"
Here, for these two records, they are the same test since they're using the same compiler to compile the same program with the same flags. But the second one is latest one. Because we can tell either from index_id, data_id or finishing date.
If there're 100000 records in my database and maybe 30000 of them are the same tests, how could I get these 30000 tests with their last run result, PASS or FAIL? Like:
gcc -g -mp -static jacobi.c FAIL
gcc -g -mp jacobi.c PASS
gcc -O3 mm.c PASS
...
I know it's wordy, your patience and any help are really appreciated!
Upvotes: 1
Views: 70
Reputation: 9010
I appear to be attracted to self-flagellation. Anyway, first thing is first - assembling your flags into a string, and getting the results.:
select i.compiler, group_concat(f.flag_name order by f.flag_name asc separator ' ') flags, i.program, d.date, l.log
from `index` i
inner join `flag` f
on i.flag_id = f.id
inner join `data` d
on i.data_id = d.id
inner join `log` l
on i.id = l.index_id
group by i.id
the order by
in the group_concat
ensures the same set of flags will always appear in the same order.
This should give us a result that looks like compiler | flag1 flag 2 ... | program | date
Next step is to group by compiler, the flags and the program, and get the max(date)
. We do this by querying the results of the first query.
select compiler, flags, program, max(`date`) max_date
from (
select i.compiler, group_concat(f.flag_name order by f.flag_name asc separator ' ') flags, i.program, d.date
from `index` i
inner join `flag` f
on i.flag_id = f.id
inner join `data` d
on i.data_id = d.id
group by i.id ) q
group by compiler, flags, program;
We now know the most recent run time for each combo of compiler, flags and program.
We now need to join our first query back to this one, to find only the row with the max date, and the associated data:
select q1.* from (
select i.compiler, group_concat(f.flag_name order by f.flag_name asc separator ' ') flags, i.program, d.date, l.log
from `index` i
inner join `flag` f
on i.flag_id = f.id
inner join `data` d
on i.data_id = d.id
inner join `log` l
on i.id = l.index_id
group by i.id
) q1
inner join (
select compiler, flags, program, max(`date`) max_date
from (
select i.compiler, group_concat(f.flag_name order by f.flag_name asc separator ' ') flags, i.program, d.date
from `index` i
inner join `flag` f
on i.flag_id = f.id
inner join `data` d
on i.data_id = d.id
group by i.id ) q
group by compiler, flags, program ) q2
on q1.compiler = q2.compiler
and q1.flags = q2.flags
and q1.program = q2.program
and q1.`date` = q2.max_date
sql fiddle is broken right now so i can't test this - but it should be pretty close.
Upvotes: 1