Reputation: 4109
I'm trying to understand how to use dplyr
on a remote database vs data stored in R. Namely, I'm unclear on what functions can be used to with mutate()
. For example, this works just fine:
diamonds %>%
select(color, cut, price) %>%
mutate(
newcol = paste0(cut, color)
)
However, if I try to use paste()
on a remote database (that is too large to store locally) I get an error saying
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: function paste0()
That's one example, but I noticed a similar error when trying to use POSIXct dates and other functions from non-base R.
My question: Am I limited to only using very basic aggregating functions like are mentioned here? If not, how does one implement other functions (custom, additional libraries etc.) through dplyr on remote databases?
Upvotes: 4
Views: 451
Reputation: 675
yes, dplyr
uses the dbplyr
package for the SQL translations. In it, we have to manually specify how each R command translates to that specific SQL syntax, so in some cases one function may work for one database and not for others. I just checked the translation for PostgreSQL and it looks like we have a translation for paste()
but not paste0()
. In the meanwhile, you can also pass SQL commands inside dplyr
verbs, for example, mutate(m = strpos(field1, "search"))
will run the PostgreSQL strpos
that is used to locate a string within a field.
Upvotes: 2