Daniel
Daniel

Reputation: 1357

Query equivalence evaluation

My question is rooted in T-SQL, SQL Server environment, but its scope is not confined to this technology. I am working on a database with a quite complex business logic, with existing views, stored procedures and new ones to be designed. By means of comparisons of different queries or part of them, I have a strong feeling that there are sections performing the same job with a different arrangement, but of course to refactor the whole mess I need something more that a feeling; so I am trying to determine a way to demonstrate that two statements are equivalent.

An obvious but weak response could be to ascertain that the two queries A and B produce the same recordset: if A is a subset of B and B is a subset of A, they are the same recordset; but I am not sure that this is a good idea because, of course, a recordset is not a query, the results could depend on data and specific parameter values. My questions is: there is a method to prove the equivalence of two different queries? I would say yes, because the optimization performed by the database should works on this. Someone could provide me some pointer to documentation or books digging in this? If there is no general method to prove the equivalence, there is some smart approach based on regression testing performed according to some effective heuristic that does the job?

Edited later: in case, reverse engineering the queries (by hand?) by means of relational algebra, could be a superior method to assess the query equivalence instead of using other queries and / or the computer? There are automated tools helping in performing this "reverse engineering", in case?

Thanks a lot for helping

Upvotes: 1

Views: 855

Answers (5)

Lieven Keersmaekers
Lieven Keersmaekers

Reputation: 58491

I doubt you are going to be able to formally proof or disprove this but my take on this would be to

  • identify all use cases
  • identify all boundary values
  • identify all parameters

and derive a test plan from that. It would require you to

  • create testdata for each case
  • run both queries against that data
  • compare the results

If you don't find any differences after testing, you can be reasonably assured that both statements are equivallent.

Upvotes: 0

wildplasser
wildplasser

Reputation: 44250

You'll need to implement some "canonical query plan" generator for this (an "optimal query plan" as generated by the DBMS can be nondeterministic). In most cases, using alphabetical ordering of terms and tables as a tie-breaker will get you there.

Upvotes: 0

Shaikh Farooque
Shaikh Farooque

Reputation: 2640

Only by the execution plan you can check it. Apart from that i dont think that there is any way to prove this thing.

Upvotes: 0

Frank Schmitt
Frank Schmitt

Reputation: 30815

You probably can't prove it, since the problem seems to be NP-complete; check this SO question on query equivalence (that one is about Oracle, but there are a couple of answers / links that should be relevant for you).

Upvotes: 1

aF.
aF.

Reputation: 66727

You can check the execution plans of the two queries. If they are the same, you have your answer!

Upvotes: 0

Related Questions