Reputation: 467
Suppose using dbplyr, we have something like
library(dbplyr)
sometable %>%
head()
then we see the first 6 rows.
But if we try this we see an error
sometable %>%
tail()
# Error: tail() is not supported by sql sources
which is expected behaviour of dbplyr:
Because you can’t find the last few rows without executing the whole query, you can’t use tail().
Question: how do we do the tail()
equivalent in this situation?
Upvotes: 1
Views: 477
Reputation: 160447
In general, the order of SQL queries should never be assumed, as the DBMS may store it in an order that is ideal for indexing or other reasons, and not based on the order you want. Because of that, a common "best practice" for SQL queries is to either (a) assume the data is unordered (and perhaps that the order may change, though I've not seen this in practice); or (b) force ordering in the query.
From this, consider arranging your data in a descending manner and use head
.
For instance, if I have a table MyTable
with a numeric field MyNumber
, then
library(dplyr)
library(dbplyr)
tb <- tbl(con, "MyTable")
tb %>%
arrange(MyNumber) %>%
tail() %>%
sql_render()
# Error: tail() is not supported by sql sources
tb %>%
arrange(MyNumber) %>%
head() %>%
sql_render()
# <SQL> SELECT TOP(6) *
# FROM "MyTable"
# ORDER BY "MyNumber"
tb %>%
arrange(desc(MyNumber)) %>%
head() %>%
sql_render()
# <SQL> SELECT TOP(6) *
# FROM "MyTable"
# ORDER BY "MyNumber" DESC
(This is (obviously) demonstrated on a SQL Server connection, but the premise should work just as well for other DBMS types, they'll just shift from SELECT TOP(6) ...
to SELECT ... LIMIT 6
or similar.)
Upvotes: 3