www
www

Reputation: 4391

Cartesian product and WHERE clause issue

Let us have simple table:

CREATE TABLE dbo.test
(
    c1  INT
)

INSERT INTO test (c1) VALUES (1)
INSERT INTO test (c1) VALUES (2)
INSERT INTO test (c1) VALUES (3)

Next calculate some SUM:

SELECT SUM(t1.c1) FROM test AS t1 , test AS t2
WHERE t2.c1 = 1

Output is: 6 . Simple and easy.

But if I run:

SELECT SUM(t1.c1), * FROM test AS t1 , test AS t2
WHERE t2.c1 = 1

The output is:

6   2   2
6   2   3
6   2   1
6   3   2
6   3   3
6   3   1
6   1   2
6   1   3
6   1   1

My question is: Why the second output is not matching the condition in WHERE clause?

Upvotes: 1

Views: 2083

Answers (1)

Michael Fredrickson
Michael Fredrickson

Reputation: 37398

Looks like Sybase implements it's own extensions to GROUP BY:

Through the following extensions, Sybase lifts restrictions on what you can include or omit in the select list of a query that includes group by.

  • The columns in the select list are not limited to the grouping columns and columns used with the vector aggregates.

  • The columns specified by group by are not limited to those non-aggregate columns in the select list.

However, the results of the extension are not always intuitive:

When you use the Transact-SQL extensions in complex queries that include the where clause or joins, the results may become even more difficult to understand.

How does this relate to your problem?

However, the way that Adaptive Server handles extra columns in the select list and the where clause may seem contradictory. For example:

select type, advance, avg(price) 
from titles 
where advance > 5000
group by type

type           advance
-------------  ---------  --------
business        5,000.00      2.99
business        5,000.00      2.99
business       10,125.00      2.99
business        5,000.00      2.99
mod_cook            0.00      2.99
mod_cook       15,000.00      2.99
popular_comp    7,000.00     21.48
popular_comp    8,000.00     21.48
popular_comp        NULL     21.48
psychology      7,000.00     14.30
psychology      2,275.00     14.30
psychology      6,000.00     14.30
psychology      2,000.00     14.30
psychology      4,000.00     14.30
trad_cook       7,000.00     17.97
trad_cook       4,000.00     17.97
trad_cook       8,000.00     17.97



(17 rows affected)

It only seems as if the query is ignoring the where clause when you look at the results for the advance (extended) column. Adaptive Server still computes the vector aggregate using only those rows that satisfy the where clause, but it also displays all rows for any extended columns that you include in the select list. To further restrict these rows from the results, you must use a having clause.

So, to give you the results you would expect, Sybase should allow you to do:

SELECT SUM(t1.c1), * FROM test AS t1 , test AS t2
WHERE t2.c1 = 1
HAVING t2.c1 = 1

The WHERE will exclude the results from the total SUM; the HAVING will hide records that don't match the condition.

Confusing, isn't it?

Instead, you'd probably be better off writing the query so that it doesn't require Sybase's GROUP BY extensions.

Upvotes: 7

Related Questions