Ole Petersen

Reputation: 680

Loading csv-files in sparkR

In R I have created two datasets which I have saved as csv-files by

    write.csv(liste, file="/home/.../liste.csv", row.names=FALSE)
    write.csv(data, file="/home/.../data.csv", row.names=FALSE)

I now want to open these csv files in SparkR. So I type

liste <- read.df(sqlContext, "/home/.../liste.csv", "com.databricks.spark.csv", header="true", delimiter= "\t")

data <- read.df(sqlContext, "/home/.../data.csv", "com.databricks.spark.csv", header="true", delimiter= "\t")

It turns out that the first dataset, 'liste', loads successfully in SparkR, but 'data' cannot be loaded, for reasons I don't understand.

'liste' is just a vector of numbers in R, whereas 'data' is a data.frame that I loaded into R and then removed some parts from. SparkR gives me this error message:

Error: returnStatus == 0 is not TRUE
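Before debugging on the Spark side, it can help to check what actually landed in the failing file. A small base-R sketch (using the question's placeholder path):

```r
# Inspect the file that fails to load in SparkR:
# an unexpectedly small size suggests the data was never actually written out
file.info("/home/.../data.csv")$size

# The first few lines should show a header row followed by data rows
readLines("/home/.../data.csv", n = 5)
```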

Upvotes: 3

Views: 2004

Answers (1)

Wannes Rosiers

Reputation: 1690

'liste' is a local R object, which write.csv can save correctly. 'data', however, is a SparkR DataFrame, and write.csv cannot serialize it: it writes only the object's pointer, not the underlying data. That's why the resulting file is only 33 kB.
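If 'data' really is a SparkR DataFrame, there are two ways to get it to disk: collect it into a local data.frame first, or let Spark write it out itself with write.df. A sketch against the SparkR 1.x API used in the question (paths are the question's placeholders):

```r
# Option 1: pull the distributed DataFrame into local R, then save it as usual
local_data <- collect(data)
write.csv(local_data, file = "/home/.../data.csv", row.names = FALSE)

# Option 2: have Spark itself write the CSV via the spark-csv package
write.df(data, path = "/home/.../data_csv",
         source = "com.databricks.spark.csv", mode = "overwrite")
```

Note that write.df writes a directory of part files rather than a single CSV file; option 1 is simpler when the data fits in local memory.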

Upvotes: 2
