SQL reduce data in join or where

I want to know what is faster, assuming I have the following queries and they retrieve the same data

select * from tableA a inner join tableB b on a.id = b.id where b.columnX = value

or

select * from tableA inner join (select * from tableB where b.columnX = value) b on a.id = b.id

I think makes sense to reduce the dataset from tableB in advanced, but I dont find anything to backup my perception.

Upvotes: 0

Views: 974

Answers (4)

Santiago Hitta
Santiago Hitta

Reputation: 75

The DBMS decides the access path that will be used to resolve the query, you can't decide it, but you can do certain things like declaring indexes so that the DBMS takes those indexes into consideration when deciding which access path it will use to resolve the query, and then you will get a better performance.

For instance, in this example you are filtering tableB by b.columnX, normally if there are no indexes declared for tableB the DBMS will have to do a full table scan to determine which rows fulfill that condition, but suppose you declare an index on tableB by columnX, in that case the DBMS will probably consider that index and determine an access path that makes use of the index, getting a much better performance than a full table scan, specially if the table is big.

Upvotes: 0

Mohammed Faisal
Mohammed Faisal

Reputation: 1

Whenever you come across such scenarios wherein you feel that which query would yeild the results faster in teradata, please use the EXPLAIN plan in teradata - which would properly dictate how the PE is going to retrieve records. If you are using Teradata sql assistant then you can select the query and press F6.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270583

In a database such as Teradata, the two should have exactly the same performance characteristics.

SQL is not a procedural language. A SQL query describes the result set. It does not specify the sequence of actions.

SQL engines process a query in three steps:

  1. Parse the query.
  2. Optimize the parsed query.
  3. Execute the optimized query.

The second step gives the engine a lot of flexibility. And most query engines will be quite intelligent about ignoring subqueries, using indexes and partitions based on where clauses, and so on.

Upvotes: 3

TFD
TFD

Reputation: 24544

Most SQL dialects compile your query into an execution plan. Teradata and most SQL system show the expected execution plan with the "explain" command. Teradata has a visual explain too, which is simple to learn from

It depends on the data volumes and key type in each table, if any method would be advantageous

Most SQL compilers will work this out correctly using the current table statistics (data size and spread)

In some SQL systems your second command would be worse, as it may force a full temporary table build by ALL fields on tableB

It should be (not that I recommend this query style at all)

select * from tableA inner join (select id from tableB where columnX = value) b on a.id = b.id

In most cases, don't worry about this, unless you have a specific performance issue, and then use the explain commands to work out why

A better way in general is to use common table expressions (CTE) to break the problem down. This leads to better queries that can be tested and maintain over the long term

Upvotes: 0

Related Questions