Reputation: 1357
My question is rooted in T-SQL, SQL Server environment, but its scope is not confined to this technology. I am working on a database with a quite complex business logic, with existing views, stored procedures and new ones to be designed. By means of comparisons of different queries or part of them, I have a strong feeling that there are sections performing the same job with a different arrangement, but of course to refactor the whole mess I need something more that a feeling; so I am trying to determine a way to demonstrate that two statements are equivalent.
An obvious but weak response could be to ascertain that the two queries A and B produce the same recordset: if A is a subset of B and B is a subset of A, they are the same recordset; but I am not sure that this is a good idea because, of course, a recordset is not a query, the results could depend on data and specific parameter values. My questions is: there is a method to prove the equivalence of two different queries? I would say yes, because the optimization performed by the database should works on this. Someone could provide me some pointer to documentation or books digging in this? If there is no general method to prove the equivalence, there is some smart approach based on regression testing performed according to some effective heuristic that does the job?
Edited later: in case, reverse engineering the queries (by hand?) by means of relational algebra, could be a superior method to assess the query equivalence instead of using other queries and / or the computer? There are automated tools helping in performing this "reverse engineering", in case?
Thanks a lot for helping
Upvotes: 1
Views: 855
Reputation: 58491
I doubt you are going to be able to formally proof or disprove this but my take on this would be to
and derive a test plan from that. It would require you to
If you don't find any differences after testing, you can be reasonably assured that both statements are equivallent.
Upvotes: 0
Reputation: 44250
You'll need to implement some "canonical query plan" generator for this (an "optimal query plan" as generated by the DBMS can be nondeterministic). In most cases, using alphabetical ordering of terms and tables as a tie-breaker will get you there.
Upvotes: 0
Reputation: 2640
Only by the execution plan you can check it. Apart from that i dont think that there is any way to prove this thing.
Upvotes: 0
Reputation: 30815
You probably can't prove it, since the problem seems to be NP-complete; check this SO question on query equivalence (that one is about Oracle, but there are a couple of answers / links that should be relevant for you).
Upvotes: 1
Reputation: 66727
You can check the execution plans of the two queries. If they are the same, you have your answer!
Upvotes: 0