Reputation: 13753
Dumb question time. Oracle 10g.
Is it possible for a where clause to affect a join?
I've got a query in the form of:
select * from
(select product, product_name from products p
join product_serial ps on product.id = ps.id
join product_data pd on pd.product_value = to_number(p.product_value)) product_result
where product_name like '%prototype%';
Obviously this is a contrived example. No real need to show the table structure as it's all imaginary. Unfortunately, I can't show the real table structure or query. In this case, p.product_value is a VARCHAR2 field which in certain rows have an ID stored inside it rather than text. (Yes, bad design - but something I inherited and am unable to change)
The issue is in the join. If I leave out the where clause, the query works and rows are returned. However, if I add the where clause, I get "invalid number" error on the pd.product_value = to_number(p.product_value) join condition.
Obviously, the "invalid number" error happens when rows are joined which contain non-digits in the p.product_value field. However, my question is how are those rows being selected? If the join succeeds without the outer where clause, shouldn't the outer where clause just select rows from the result of the join? It appears what is happening is the where clause is affecting what rows are joined, despite the join being in an inner query.
Is my question making sense?
Upvotes: 5
Views: 1125
Reputation: 13561
Short answer: yes.
Long answer: the query engine is free to rewrite your query however it wants, as long as it returns the same results. All of the query is available to it to use for the purpose of producing the most efficient query it can.
In this case, I'd guess that there is an index that covers what you are wanting, but it doesn't cover product name, when you add that to the where clause, the index isn't used and instead there's a scan where both conditions are tested at the same time, thus your error.
Which is really an error in your join condition, you shouldn't be using to_number unless you are sure it's a number.
Upvotes: 1
Reputation: 231681
For more background (and a really good read), I'd suggest reading Jonathan Gennick's Subquery Madness.
Basically, the problem is that Oracle is free to evaluate predicates in any order. So it is free to push (or not push) the product_name
predicate into your subquery. It is free to evaluate the join conditions in any order. So if Oracle happens to pick a query plan where it filters out the non-numeric product_value
rows before it applies the to_number
, the query will succeed. If it happens to pick a plan where it applies the to_number
before filtering out the non-numeric product_value
rows, you'll get an error. Of course, it's also possible that it will return the first N rows successfully and then you'll get an error when you try to fetch row N+1 because row N+1 is the first time that it is trying to apply the to_number
predicate to a non-numeric data.
Other than fixing the data model, you could potentially throw some hints into the query to force Oracle to evaluate the predicate that ensures that all the non-numeric data is filtered out before the to_number
predicate is applied. But in general, it's a bit challenging to fully hint a query in a way that will force the optimizer to always evaluate things in the "proper" order.
Upvotes: 0
Reputation: 86735
It affects the plan that's generated.
The actual order that tables are joined (and so filtered) is not dictated by the order you write your query, but by the statistics on the tables.
In one version, the plan generated co-incidentally means that the 'bad' rows never get processed; because the preceding joins filtered the result set down to a point that they're never joined on.
The introduction of the WHERE
clause has meant that ORACLE now believes a different order of join is better (because filtering by the product name requires a certain index, or because it narrows the data down a lot, etc).
This new order means that the 'bad' rows get processed before the join that filters them out.
I would endeavour to clean the data before querying it. Possibly by creating a derived column where the value is already cast to a number, or left as NULL if it is not possible to do so.
You can also use EXPLAIN PLAN to see the different plans being gerenated from your queries.
Upvotes: 2
Reputation: 13655
I guess your to_number(p.product_value)
only applies for rows with a valid product_name
.
What happens is that your join
is applied before your where
clause resulting in the failure of the to_number
function.
What you need to do is include your product_name like '%prototype%'
as a JOIN
clause like this:
select * from
(select product, product_name from products p
join product_serial ps on product.id = ps.id
join product_data pd on product_name like '%prototype%' AND
pd.product_value = to_number(p.product_value));
Upvotes: 0