Reputation: 680
In R I have created two datasets which I have saved as csv-files by
liste <- write.csv(liste, file="/home/.../liste.csv", row.names=FALSE)
data <- write.csv(data, file="/home/.../data.csv", row.names=FALSE)
I now want to open these csv files in SparkR. So I type
liste <- read.df(sqlContext, "/home/.../liste.csv", "com.databricks.spark.csv", header="true", delimiter="\t")
data <- read.df(sqlContext, "/home/.../data.csv", "com.databricks.spark.csv", header="true", delimiter="\t")
It turns out that the dataset 'liste' is loaded successfully in SparkR; however, 'data' cannot be loaded for some strange reason.
'liste' is just a vector of numbers in R, whereas 'data' is a data.frame that I loaded in R and from which I removed some parts. SparkR gives me this error message:
Error: returnStatus == 0 is not TRUE
Upvotes: 3
Views: 2004
Reputation: 1690
'liste' is a local list, which can be written with write.csv; 'data' is a SparkR DataFrame, which can't be written with write.csv: write.csv only writes its pointer, not the DataFrame itself. That's why the file is only 33 kB.
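As a rough sketch of the two usual workarounds (paths are placeholders, and this assumes the spark-csv package is on the classpath, as in your read.df calls): either let Spark write the CSV itself with write.df, or collect the DataFrame back to the driver and then use base R's write.csv.

# Option 1: write the distributed DataFrame with SparkR itself.
# Note that Spark writes a directory of part-files, not a single CSV file.
write.df(data,
         path = "/home/.../data_csv",
         source = "com.databricks.spark.csv",
         mode = "overwrite",
         header = "true")

# Option 2: pull the data back to the driver as a local data.frame,
# then write it with base R (only sensible for data that fits in memory).
local_data <- collect(data)
write.csv(local_data, file = "/home/.../data.csv", row.names = FALSE)

After either of these, your read.df call should find actual CSV data instead of a serialized pointer.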
Upvotes: 2