user1514373

Reputation: 1

Connect R to Spark through sparklyr

I'm trying to connect R to Spark following the sparklyr tutorial from RStudio: http://spark.rstudio.com/

But somehow I'm getting the strange error message below. Does anyone know how to solve this? I have tried adding the C:\Windows\system32 path to the system Path variable, without success. Thanks for your help.

> library(sparklyr)
> sc <- spark_connect(master = "local")
Error in sparkapi::start_shell(master = master, spark_home = spark_home,  : 
  Failed to launch Spark shell. Ports file does not exist.
    Path: C:\Users\Gaud\AppData\Local\rstudio\spark\Cache\spark-1.6.1-bin-hadoop2.6\bin\spark-submit.cmd
    Parameters: --jars, "C:\Users\Gaud\Documents\R\win-library\3.3\sparklyr\java\sparklyr.jar", --packages, "com.databricks:spark-csv_2.11:1.3.0","com.amazonaws:aws-java-sdk-pom:1.10.34", sparkr-shell, C:\Users\Gaud\AppData\Local\Temp\RtmpC8MAa8\file322c47ee2a28.out

Upvotes: 0

Views: 2890

Answers (4)

Morteza Mashayekhi

Reputation: 934

Based on https://github.com/rstudio/sparklyr/issues/114, the following worked for me:

sc <- spark_connect(master = "local", config = list())

Upvotes: 0

DSBLR

Reputation: 623

Install the latest sparklyr from the GitHub repository.

Steps to install sparklyr if your server has no internet access:

  • Install the R packages devtools and git2r
  • Download the master zip file from GitHub
  • Unzip it to a Windows path
  • Create a source: source <- devtools:::source_pkg("windows path/master directory name")
  • Install it: devtools::install(source)
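As an alternative to the devtools route above, base R can also install from an unzipped source directory. This is only a minimal sketch: the path below is a hypothetical placeholder, and it assumes that every package sparklyr depends on is already installed on the offline machine, since nothing can be downloaded.

```r
# Hypothetical location of the unzipped sparklyr source (adjust to your machine)
pkg_dir <- "C:/path/to/sparklyr-master"

# Install only if the directory actually exists, so the script fails softly
if (dir.exists(pkg_dir)) {
  # repos = NULL tells R to treat the first argument as a local path
  install.packages(pkg_dir, repos = NULL, type = "source")
} else {
  message("Source directory not found: ", pkg_dir)
}
```

Forward slashes in the path avoid having to escape backslashes on Windows.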

Upvotes: 1

Alex Skorokhod

Reputation: 540

I had the same problem recently. This bug was discussed on RStudio's sparklyr GitHub pages.

Could you please provide your sessionInfo() results? The output sheds light on the package versions and OS in use.

Two things helped me:

  • Install Spark using spark_install()
  • Install the development version of sparklyr using devtools::install_github("rstudio/sparklyr")

Check the version of the sparklyr package. In my case the problem disappeared only after updating to version sparklyr_0.4.11.
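The version check can be scripted roughly as follows. This is a sketch, not part of sparklyr: needs_update() is a hypothetical helper name, and the 0.4.11 threshold is simply the version that worked in my case.

```r
# Hypothetical helper: TRUE when the installed version is older than the minimum
needs_update <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

# Version that resolved the problem in this answer (may differ for you)
min_version <- "0.4.11"

if (requireNamespace("sparklyr", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("sparklyr"))
  if (needs_update(installed, min_version)) {
    message("sparklyr ", installed, " is older than ", min_version,
            "; consider devtools::install_github(\"rstudio/sparklyr\")")
  } else {
    message("sparklyr ", installed, " is at least ", min_version)
  }
} else {
  message("sparklyr is not installed")
}
```

Running sessionInfo() afterwards shows the same version information together with the OS and the other loaded packages.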

Upvotes: 2

Bob Hopez

Reputation: 783

First, make sure you have the most current version of RStudio, if that's what you're using (close RStudio, then download and install it from here): https://www.rstudio.com/products/rstudio/download/preview/

    library(DBI)
    library(lazyeval)
    library(dplyr)
    library(devtools)
    # install_github("rstudio/sparkapi")
    library(sparkapi)
    # install_github("rstudio/sparklyr")
    library(sparklyr)
    library(yaml)
    library(nycflights13)

    # Note: Only install Spark once
    spark_install(version = "1.6.1")

    # Connect to Spark through connection
    sc <- spark_connect(master = "local")
    iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE)
    flights_tbl <- copy_to(sc, nycflights13::flights, "flights", overwrite = TRUE)
    class(flights_tbl)


    flights_preview <- DBI::dbGetQuery(sc, "SELECT * FROM flights LIMIT 10")
    flights_preview

This outputs the following on Windows 10:

# year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin
# 1  2013     1   1      517            515         2      830            819        11      UA   1545  N14228    EWR
# 2  2013     1   1      533            529         4      850            830        20      UA   1714  N24211    LGA
# 3  2013     1   1      542            540         2      923            850        33      AA   1141  N619AA    JFK
# 4  2013     1   1      544            545        -1     1004           1022       -18      B6    725  N804JB    JFK
# 5  2013     1   1      554            600        -6      812            837       -25      DL    461  N668DN    LGA
# 6  2013     1   1      554            558        -4      740            728        12      UA   1696  N39463    EWR
# 7  2013     1   1      555            600        -5      913            854        19      B6    507  N516JB    EWR
# 8  2013     1   1      557            600        -3      709            723       -14      EV   5708  N829AS    LGA
# 9  2013     1   1      557            600        -3      838            846        -8      B6     79  N593JB    JFK
# 10 2013     1   1      558            600        -2      753            745         8      AA    301  N3ALAA    LGA

Upvotes: 0
