Reputation: 1530
I know that sparklyr has the following read file methods:
spark_read_csv
spark_read_parquet
spark_read_json
What about reading ORC files? Is that supported by this library yet?
I know I can use read.orc in SparkR or this solution, but I'd like to keep my code in sparklyr.
Upvotes: 3
Views: 1401
Reputation: 330063
You can use the low-level Spark API, in the same way I described in my answer to "Transfer data from database to Spark using sparklyr":
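library(dplyr)
library(sparklyr)

sc <- spark_connect(...)

# Read the ORC data through the underlying SparkSession and register it as a temporary view
spark_session(sc) %>%
  invoke("read") %>%
  invoke("format", "orc") %>%
  invoke("load", path) %>%
  invoke("createOrReplaceTempView", name)

# Reference the registered view as a dplyr table
df <- tbl(sc, name)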
where name is an arbitrary name used to identify the table and path is the location of the ORC data.
In current sparklyr versions you should be able to replace the above with spark_read_source:
spark_read_source(sc, name, source = "orc", options = list(path = path))
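If you use the low-level approach in several places, it may be convenient to wrap it in a small function. The following is only a sketch: read_orc_tbl is a hypothetical helper name, not part of sparklyr, and it assumes a live connection sc and a path readable by the cluster. Calls are namespaced so the definition does not depend on attached packages.

```r
# Hypothetical helper (not part of sparklyr): load ORC data from `path`,
# register it as a temporary view called `name`, and return a dplyr
# table reference to that view.
read_orc_tbl <- function(sc, name, path) {
  # Obtain a DataFrameReader from the underlying SparkSession
  reader <- sparklyr::invoke(sparklyr::spark_session(sc), "read")
  reader <- sparklyr::invoke(reader, "format", "orc")

  # Load the data and expose it to Spark SQL under `name`
  sdf <- sparklyr::invoke(reader, "load", path)
  sparklyr::invoke(sdf, "createOrReplaceTempView", name)

  # Return a lazy dplyr reference to the registered view
  dplyr::tbl(sc, name)
}

# Usage (assumed names, requires an active connection):
# df <- read_orc_tbl(sc, "my_orc_table", path)
```

The helper returns the same kind of object as tbl(sc, name) above, so the result can be used directly in dplyr pipelines.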
Upvotes: 5