Charles
Charles

Reputation: 73

Limiting the number of rows in subqueries with Teradata

I'm new to Teradata and I'm facing a problem I didn't have with the previous database I used. Basically, I'm trying to reduce the number of rows returned in subqueries inside a where clause. I had no problem doing this previously with the ROWNUM function.

My previous query was something like:

SELECT * FROM myTable
WHERE field1 = 'foo' AND field2 in(
    SELECT field2 FROM anotherTable
    WHERE field3 = 'bar' AND ROWNUM<100);

Since I can't use ROWNUM in TD, I've looked for equivalent functions or at least functions that would get me where I wanted even if they were'nt exactly equivalent. I found and tried : ROW_NUMBER, TOP and SAMPLE.

I tried ROW_NUMBER() but Teradata doesn't allow analytic functions in WHERE clauses. I tried TOP N but this option is not supported in a subquery. I tried SAMPLE N but it is not supported in subqueries either.

So... I have to admit I'm a bit stuck right now and was wondering if there was any solution that would allow me to limit the number of rows returned in a subquery using Teradata and that would be pretty similar to what I did up to now? Also, if there aren't any, how would it be possible to build the query differently to use it appropriately with Teradata?

Thanks!

Upvotes: 7

Views: 25966

Answers (1)

dnoeth
dnoeth

Reputation: 60462

The limited usage of SAMPLE or TOP in a subquery is probably because this might be a Correlated Subquery.

But there are two workarounds.

Put SAMPLE or TOP in a Derived Table within the subquery (so this can't be correlated anymore):

SELECT * FROM myTable
WHERE field1 = 'foo'
AND field2 IN (
     SELECT * FROM
       ( SELECT field2 FROM anotherTable -- or TOP 100
         WHERE field3 = 'bar'  SAMPLE 100
       ) AS dt
    );

Or rewrite it as a join to a Derived Table:

SELECT * FROM myTable
JOIN ( SELECT DISTNCT field2 FROM anotherTable -- can't use TOP if you need DISTINCT 
         WHERE field3 = 'bar' SAMPLE 100
       ) AS dt
WHERE field1 = 'foo'
AND myTable.field2 = dt.field1;

TOP without ORDER BY is quite similar to ROWNUM. It's not random at all, but running it a 2nd time might still return a different result set.

SAMPLE is truly random, every time returning a different result.

ROW_NUMBER is also possible using QUALIFY instead of WHERE, but OLAP functions always need some ORDER BY, so is much more overhead.

Upvotes: 6

Related Questions