R007
R007

Reputation: 123

Use of dbplyr for in-database processing vs directly writing SQL code

We are planning to explore use of dbplyr for Snowflake and would like get more insight.

  1. dbplyr converting SQL code for in-database processing will be equally efficient than directly writing SQL code ?
  2. does dbplyr supports complex multi-level (at least 3 level) sub-query ?

Over question is that ....Does it make sense to use dbplyr for enterprise level application for big data analysis ?

Upvotes: 1

Views: 259

Answers (1)

Simon.S.A.
Simon.S.A.

Reputation: 6941

dbplyr works by translating dplyr code into SQL. This has some advantages:

  • elegance of R tools
  • access to other parts of R programming language (e.g. loops, plots, packages)
  • the same code can be translated to a different database type

and some disadvantages:

  • not all R commands have translations defined
  • some restrictions on how R code can be written for translation
  • presentation of translated commands is less elegant than human written SQL code
  • error messages in R are not always sufficient to debug problems that occur in the database

Whether it is right for your application is a practical question you will need to test.

Regarding sub-queries:

  • dbplyr uses sub-queries throughout its translation, but not in the same way developers write sub-queries
  • SQL has an upper limit on the number of sub-queries it can handle, if you write your dplyr code badly then you can encounter this limit

Other answers you might find relevant:

Upvotes: 1

Related Questions