I've recently started using SparkR and would like to run some correlation analysis with it. I'm able to load my data as a SparkR DataFrame, but it doesn't let me run a simple cor() analysis on it (I get the S4 error below):
/usr/local/src/spark/spark-1.5.1/bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
library(SparkR)
setwd('/DATA/')
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')
sqlContext <- sparkRSQL.init(sc)
df <- read.df(sqlContext, "/DATA/GSE45291/GSE45291.csv", source = "com.databricks.spark.csv", inferSchema = "true")
results <- cor(data.matrix(df), method = "pearson")
## Error in as.vector(data) : no method for coercing this S4 class to a vector
Is there no built-in correlation function for SparkR? How can I fix the S4 object so that I can use base R functions on it? Any suggestions are appreciated. Thanks -Rich
Spark < 1.6
How can I fix the S4 object so that I can use base R functions on it?
You simply cannot. Spark DataFrames are not a drop-in replacement for a standard R data.frame. If you want, you can collect to a local R data.frame, but most of the time that won't be a feasible solution.
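If the data does fit in driver memory, a minimal sketch of the collect route (assuming all columns of the question's df are numeric):

local_df <- collect(df)                        # pull the whole Spark DataFrame into the driver
cor(data.matrix(local_df), method = "pearson") # base R correlation matrix on the local copy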
You can use a Hive UDF to compute the correlation between individual columns. First you'll need a Hive context:
sqlContext <- sparkRHive.init(sc)
and some dummy data:
ldf <- iris[, -5]
colnames(ldf) <- tolower(gsub("\\.", "_", colnames(ldf)))
sdf <- createDataFrame(sqlContext, ldf)
Next you have to register a temporary table:
registerTempTable(sdf, "sdf")
Now you can use a SQL query like this:
q <- sql(sqlContext, "SELECT corr(sepal_length, sepal_width) FROM sdf")
head(q)
## _c0
## 1 -0.1175698
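The same pattern extends to several column pairs in one pass; a sketch using the dummy columns registered above (output omitted):

# compute multiple pairwise correlations in a single query over the temp table
q2 <- sql(sqlContext, "SELECT corr(sepal_length, petal_length) AS sl_pl,
                              corr(sepal_width, petal_width) AS sw_pw
                       FROM sdf")
head(q2)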
Spark >= 1.6
You can use the corr function on a DataFrame directly.
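For example, with the dummy data from above, a sketch assuming the SparkR 1.6 signature corr(x, col1, col2, method = "pearson"):

# correlation of two columns computed directly on the DataFrame, no SQL needed
corr(sdf, "sepal_length", "sepal_width")
## [1] -0.1175698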