Reputation: 533
I am testing the performance of different types of SELECT from an external DB source.
I am interested in performance because only the third type (EXECUTE) is efficient when there is a WHERE clause.
Am I doing something wrong, or is it normal that U-SQL first reads all rows from the external table and then filters them inside ADLA (the same behaviour occurs with LOCATION)?
That is inefficient when my table is very large and I need only some of its rows.
Can I force U-SQL to filter the data before reading from an EXTERNAL table or from LOCATION? The problem is that I need a dynamic WHERE clause based on a variable.
Upvotes: 1
Views: 93
Reputation: 6684
First, you control the ability to push predicates to your SQL Server engine with the REMOTABLE_TYPES clause on your DATA SOURCE object.
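As a rough sketch of what that clause looks like, the following declares a data source whose REMOTABLE_TYPES list includes the types used in the predicate. The data source name, connection string, and credential name are hypothetical placeholders, and the type list is only an example; adjust it to the columns you actually filter on.

```
// Hypothetical names: HypotheticalSqlSource, HypotheticalDb, HypotheticalCredential.
// Only expressions over the listed types are candidates for being pushed
// to the remote engine; anything else is evaluated inside ADLA.
CREATE DATA SOURCE IF NOT EXISTS HypotheticalSqlSource
FROM AZURESQLDB
WITH
(
    PROVIDER_STRING = "Database=HypotheticalDb;Trusted_Connection=False;Encrypt=True",
    CREDENTIAL = HypotheticalCredential,
    REMOTABLE_TYPES = (bool, byte, short, int, long, decimal, float, double, string, DateTime)
);
```

If a type that appears in your WHERE clause is missing from the list, a comparison on it would not be expected to remote.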
Then the predicate itself needs to be remotable. If your predicate involves a join with a U-SQL rowset (table), it may not be easy to remote it efficiently (I am not sure whether we map a join into a semijoin yet).
Since you seem to be able to remote the predicate you use in the EXECUTE, I would think that there is a good chance that you could write the queries in the other cases in a way that they can be remoted. But without seeing the queries, it is hard to say for sure.
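For illustration only, here is one shape of query that is a plausible candidate for remoting: a simple comparison between a column of a remotable type and a scalar variable, with no join against a U-SQL rowset. The data source, table, column, and output path are hypothetical, and whether the predicate is actually pushed down depends on your data source definition and on the compiler, so check the job graph rather than assuming it.

```
// Hypothetical names throughout; this is a sketch, not the original poster's query.
DECLARE @minId int = 1000;

@rows =
    SELECT Id, Name
    FROM EXTERNAL HypotheticalSqlSource LOCATION "dbo.HypotheticalTable"
    WHERE Id >= @minId;   // simple scalar comparison on an int column

OUTPUT @rows
TO "/output/hypothetical_rows.csv"
USING Outputters.Csv();
```

This keeps the dynamic part of the filter in a scalar variable instead of in a joined rowset, which is the case the answer flags as harder to remote.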
If you want us to take a look, please contact me by email (usql at microsoft dot com).
Upvotes: 2