Reputation: 76

Efficiency of top vs. sample in Teradata

In Teradata, which is more efficient - Sample or Top?

Since sample is random, does that cause Teradata to do less work and result in faster returns?

Routinely, I just want to see a few rows.

Upvotes: 1

Answers (4)

Filip Uscilowicz

Reputation: 66

I just ran three queries on a large Teradata dataset:

SELECT * FROM table Sample 10;
SELECT * Top 10 FROM table; (with no order by)
SELECT * Top 10 FROM table ORDER BY column;

The DBQL metrics show that by far, the Top 10 with no order is the least resource-intensive. I had about a 99% drop in I/O & CPU just changing from SAMPLE to TOP.

So if your goal is purely efficiency, then TOP without the Order by is the clear winner going by TD's DBQL metrics.

Upvotes: 5

Jason

Reputation: 945

From an efficiency standpoint:

Using Top WITHOUT an Order By will be about the same as Sample.

Using Top WITH an Order By will cause more overhead.

Upvotes: 0

Biswabid

Reputation: 1411

I think its better to use TOP without order by than sample. Both of these evaluate the whole expression before displaying the result.

Reference : http://www.teradatatips.com/2010/10/top-vs-sample.html

Upvotes: -2

ULick

Reputation: 999

TOP usually is the faster one, taking the shortcut out of processing when enough data has been returned.

Sample has a lot more options and works to be statistically correct. To achieve that, it may have to read more data blocks.

Upvotes: 0

Efficiency of top vs. sample in Teradata

Answers (4)

Related Questions