Nancy

Reputation: 4109

Using dplyr on local vs remote databases

I'm trying to understand how to use dplyr on a remote database vs data stored in R. Namely, I'm unclear on which functions can be used with mutate(). For example, this works just fine:

diamonds %>%
    select(color, cut, price) %>%
    mutate(
        newcol = paste0(cut, color)
    )

However, if I try to use paste0() on a remote database (that is too large to store locally), I get an error saying:

Error in postgresqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  function paste0()

That's one example, but I've noticed similar errors when trying to use POSIXct dates and other functions from outside base R.

My question: Am I limited to only the very basic aggregating functions like those mentioned here? If not, how does one use other functions (custom functions, functions from additional libraries, etc.) through dplyr on remote databases?

Upvotes: 4

Views: 451

Answers (1)

edgararuiz

Reputation: 675

Yes, dplyr uses the dbplyr package for its SQL translations. In dbplyr we have to specify manually how each R function translates to each database's SQL syntax, so in some cases a function may work for one database but not for others. I just checked the PostgreSQL translations, and there is a translation for paste() but not for paste0(). In the meantime, you can also call SQL functions directly inside dplyr verbs; for example, mutate(m = strpos(field1, "search")) will run PostgreSQL's strpos() function, which locates a substring within a field.
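As a sketch of how you can check this yourself: dbplyr provides translate_sql() and simulated connections (e.g. simulate_postgres()) to preview the SQL an R expression becomes, without touching a live database. Exact helper names may vary between dbplyr versions, so treat this as illustrative; field1 is a hypothetical column name.

```r
library(dbplyr)

# Simulated PostgreSQL connection for previewing translations
con <- simulate_postgres()

# See what SQL a given R expression is translated into for PostgreSQL
translate_sql(paste(cut, color), con = con)

# Functions dbplyr does not recognize are passed through verbatim,
# which is why database-native functions like strpos() can be used
# directly inside mutate() and friends:
translate_sql(strpos(field1, "search"), con = con)
```

The pass-through behavior in the second call is the key point: anything without an explicit translation is sent to the database as-is, so any function the database itself understands is fair game inside dplyr verbs.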

Upvotes: 2
