Raphael K

Reputation: 2353

Can't read csv into Spark using spark_read_csv()

I'm trying to use sparklyr to read a csv file into R. I can read the .csv into R just fine using read.csv(), but when I try to use spark_read_csv() it fails.

accidents <- spark_read_csv(sc, name = 'accidents', path = '/home/rstudio/R/Shiny/accident_all.csv')

However, when I attempt to execute this code I receive the following error:

Error in as.hexmode(xx) : 'x' cannot be coerced to class "hexmode"

I haven't found much by Googling that error. Can anyone shed some light onto what is going on here?

Upvotes: 4

Views: 7618

Answers (1)

Koushik Khan

Reputation: 179

Yes, local .csv files can be read easily into a Spark DataFrame using spark_read_csv(). I have a .csv file in my Documents directory, and I read it with the following snippet. I think there is no need to use the file:// prefix:

Sys.setenv(SPARK_HOME = "C:/Spark/spark-2.0.1-bin-hadoop2.7/")

# sparklyr and dplyr are all that spark_read_csv() requires;
# loading SparkR alongside sparklyr is unnecessary and can mask dplyr functions
library(sparklyr)
library(dplyr)

# connect to a local Spark 2.0.1 instance
sc <- spark_connect(master = "local",
                    spark_home = "C:/Spark/spark-2.0.1-bin-hadoop2.7/",
                    version = "2.0.1")

# read the local csv into a Spark DataFrame; no file:// prefix needed
Credit_tbl <- spark_read_csv(sc, name = "credit_data",
                             path = "C:/Users/USER_NAME/Documents/Credit.csv",
                             header = TRUE, delimiter = ",")

You can see the dataframe just by calling the object name Credit_tbl.
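Once the data is in Spark, a few dplyr verbs let you inspect it without pulling everything back into R. A minimal sketch, assuming the live connection sc and the table Credit_tbl created above (these snippets require a running Spark instance):

```r
library(sparklyr)
library(dplyr)

# print the first rows (sparklyr fetches only a small preview from Spark)
head(Credit_tbl)

# count rows on the Spark side
Credit_tbl %>% summarise(n = n())

# bring a small sample back into R as an ordinary data frame
local_df <- Credit_tbl %>% head(100) %>% collect()
```

collect() is the step that materializes data in local R memory, so it is best applied only after filtering or sampling on the Spark side.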

Upvotes: 4
