SHR
SHR

Reputation: 8313

MySQL: Does unified select is always faster than seperated simple selects?

I have 3 tables:

# config - only 1 line in table
# Config (maxCount int) 
Create Table Config(maxCount int)  
# files- 20k lines in table
# Files(fileid int (pk), filename varchar (unique), revCount int) 
Create Table Files ( 
   fileid int(10) unsigned not null auto_increment,
   filename varchar(255) not null,
   revCount int(10) unsigned default 1,
   primary key(fileid),
   unique key filename(filename) 
);
# Revitions - 26k lines in table
# Revisions(revid (pk),  fileid int (fk), revname varchar,...) 
create table Revisions( 
   revid int(10) unsigned not null auto_increment,
   fileid int(10) unsigned not null, 
   revname varchar(255) not null,
   mtime  timestamp default CURRENT_TIMESTAMP,
   deleted boolean default false,
   primary key(revid),
   KEY fr_fileid_fk_idx (fileid),
   CONSTRAINT fr_fileid_fk FOREIGN KEY(fileid)
     REFERENCE Files(fileid)
     ON DELETE CASCADE ON UPDATE CASCADE 
);

and have the following stored procedure:

create procedure test1(in file_id int,out max_count int, out rev_count int, out last_revid int)
begin 
  select maxCount into max_count from Config limit 1;
  select revCount into rev_count from Files where fileid=file_id;
  select max(revid) into last_revid from Revitions where fileid=file_id;
end

I checked performance of test1 by calling it for each fileid in files (20k) calls it took approximaly 2.7 seconds to run

the test proc is:

create procedure t1()
begin
  declare done int default 0;
  declare file_id,last_revid,max_count,rev_cout int default 0;
  declare c1 cursor for select fileid from files;
  declare continue handle for not found select 1 into done from (select 1) as t;
  open c1;
  read1: loop
    fetch c1 into file_id;
    if done=1 then
      leave read1;
    end if;
    call test1(file_id, max_count,rev_cout,last_revid);
  end loop;
  close c1;
end 

I've tried other solution, to unify the 3 selects into 1 query, like the following:

create procedure test2(in file_id int,out max_count int, out rev_count int, out last_revid int)
begin 
  select maxCount, revCount, max(revid) 
  into   max_count,rev_count,last_revid 
  from Config, Files, Revitions 
  where Files.fileid=file_id AND Revitions.fileid=file_id
  limit 1;
end

i change t1 to t2 by changing the call test1 to call test2

the result was a significantly better performance, the second (t2) took like 2 seconds (more efficient by 25%!), I've repeat the tests many times the result was always the same.

is there any reson why?

I thought the join of the tables is less efficient then select from each table seperatly but obviously the second was faster.

so can I count on it and always prefer to unify select queries from several tables or should I check for each case which is the best?

Upvotes: 0

Views: 45

Answers (2)

Rick James
Rick James

Reputation: 142296

As long as you are comparing techniques, give this a try:

SELECT
  ( select maxCount FROM Config limit 1 ) AS max_count,
  ( select revCount FROM Files where fileid=file_id ) AS rev_count,
  ( select max(revid) FROM Revitions where fileid=file_id ) AS last_revid ;

(No Stored Proc needed.)

Upvotes: 1

spencer7593
spencer7593

Reputation: 108410

It's not always faster to run a single combined query than three separate queries. Sometimes it is, other times not.

Q: is there any reason why [a single combined query is faster than three individual queries]?

A: There is overhead for running a query from a procedure, just like there is from any client. There's a context switch from procedure to the database engine, and the database engine overhead for the statement... parse, syntax check, semantic check, determining an access plan, executing the query, materializing the result and returning it to the client, as well as the cleanup when the statement is closed.

We can avoid some of this overhead using a single combined statement, one roundtrip to the database, in place of three separate roundtrip statement executions.

But there isn't any guarantee that a combined statement will be faster. It really depends on the execution plan of the statements, and how much time it takes to execute each query.

If the execution time of the single combined query is significantly higher than the total execution time of the three separate statements, that will effectively negate any overhead savings, and can be slower overall.


Q: Can I count on it and always prefer to unify select queries from several tables or should I check for each case which is the best?

A: Frequently a combined query will be faster. But in the generalized case, we can't "count on" it. We need to check each case.

Upvotes: 2

Related Questions