user295944

Reputation: 373

Sparklyr read database table to distributed DF

Hi, I am trying to figure out if there is a way to read a database table directly into a distributed Spark DataFrame with sparklyr. I have RStudio installed on an EMR cluster that hosts my Hive metastore.

I know I can do the following:

library(sparklyr)
library(dplyr)
library(DBI)

sc <- spark_connect(master = "local")

# dbGetQuery() runs the query and collects the full result set
# into a local R data.frame on the driver
query <- "select * from schema.table"
result <- dbGetQuery(sc, query)

# copy_to() then ships that local data.frame back out to Spark
result_t <- copy_to(sc, result)

but since dbGetQuery() pulls the entire result into local R memory only for copy_to() to push it back to Spark, is there a way to query the table directly into a distributed table like result_t?

Upvotes: 0

Views: 1421

Answers (1)

user295944

Reputation: 373

As @kevinykuo suggested, dplyr::tbl() gives you a reference to the table without collecting it locally:

result_t <- tbl(sc, "schema.table")
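
For anyone landing here later, a minimal sketch of how that lazy reference behaves (some_column is a hypothetical column name standing in for one of yours; if the bare "schema.table" string does not resolve on your sparklyr version, dbplyr::in_schema() spells the schema out explicitly):

library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Lazy reference: no rows are moved into R at this point
result_t <- tbl(sc, "schema.table")
# Equivalent, with the schema named explicitly:
# result_t <- tbl(sc, dbplyr::in_schema("schema", "table"))

# dplyr verbs are translated to Spark SQL and executed in the cluster;
# collect() brings back only the small aggregated result
counts <- result_t %>%
  group_by(some_column) %>%
  summarise(n = n()) %>%
  collect()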

Upvotes: 2
