Koushik Khan

Reputation: 179

Read a .csv file with Sparklyr in R

I have a couple of .csv files in C:\Users\USER_NAME\Documents that are more than 2 GB in size. I want to use Apache Spark to read the data from them in R. I am using Microsoft R Open 3.3.1 with Spark 2.0.1.

I am stuck trying to read the .csv files with the spark_read_csv(...) function from the sparklyr package. It asks for a file path that starts with file://. I want to know the proper file path for my case: one that starts with file:// and ends with the name of a file in the .../Documents directory.
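For reference, this is roughly what I am trying (mydata.csv and the table name are placeholders for my actual files); I am not sure the file:// path below is in the right form:

library(sparklyr)

# connect to a local Spark 2.0.1 instance
sc <- spark_connect(master = "local")

# attempt to read one of the files under Documents
df <- spark_read_csv(
  sc,
  name = "mydata",
  path = "file:///C:/Users/USER_NAME/Documents/mydata.csv"
)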

Upvotes: 1

Views: 1012

Answers (1)

Felix

Reputation: 309

I had a similar problem. In my case I had to put the .csv file into the HDFS file system before calling spark_read_csv on it.

I think you probably have a similar problem.

If your cluster is also running HDFS, you need to use:

hdfs dfs -put
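For example (a rough sketch; the paths, file name, and master setting below are placeholders you would replace with your own):

# from a shell: copy the local .csv into HDFS first
#   hdfs dfs -put /local/path/mydata.csv /user/USER_NAME/mydata.csv

library(sparklyr)

# connect to the cluster (the master value depends on your setup)
sc <- spark_connect(master = "yarn-client")

# read the file from its HDFS location
df <- spark_read_csv(
  sc,
  name = "mydata",
  path = "hdfs:///user/USER_NAME/mydata.csv"
)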

Best, Felix

Upvotes: 1
