Reputation: 1604
Is it possible to limit the number of rows read from a table with an Athena query, and if so how?
I looked at the LIMIT
statement, but it seems LIMIT
doesn't affect how many rows are scanned, but simply limits the number of rows returned from a query. But I'm not looking to reduce the number of rows returned, but restrict the query to be performed on, say, the top N rows in a table. For example, I have a table that has a date column, call it mydate
, that is the same for every row in the table, mytable
. I want to return a single value of mydate
. I could do this with:
select
min(mydate)
from
mytable
but this will perform a full table scan. What I really want is for a query to take the table I provide it and "look" at the first row and return mydate
, so it only has to scan a single record.
I use the SAS statistical software and this can be done in it by running:
data work.temp; /*create a temp dataset in the working directory*/
set mytable(obs=2); /*read in only the first two rows from the mytable table*/
run;
I also use R and in the R language, I could do this even more simply and directly using the following code:
mytable[1, ]$mydate
So, I'm looking for a way, if possible, to accomplish the same thing that SAS and R are doing above.
Thanks.
Upvotes: 0
Views: 250