Reputation: 1293
I'm using AWS EMR at work. When I launch a Spark shell I can run Scala commands, but I can't read in a local file.
For example:
scala> val citi = spark.read.textFile("CitiGroup2006")
org.apache.spark.sql.AnalysisException: Path does not exist: hdfs://ip-10-99-99-99.ec2.internal:8020/user/hadoop/CitiGroup2006;
I tried entering the full path of the file but I get the same error. The file is in the same directory where I launched the Spark shell. Loading a Scala file does work, however:
:load hello.scala
Why does "load" work but not spark.read.textFile?
Upvotes: 1
Views: 1441
Reputation: 116
I'm not so strong on Scala, but it looks like spark.read.textFile reads from HDFS, and I guess your file is on the EMR node's local filesystem.
You can list the files on HDFS with:
$ hdfs dfs -ls
and copy a local file into HDFS with -put.
Check out "hadoop copy a local file system folder to HDFS" and hadoop-common/FileSystemShell.
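As a rough sketch of the copy step, assuming the file sits in the directory you run this from on the master node, and that /user/hadoop (taken from the path in your error message) is your HDFS home directory. Both values are assumptions, so adjust them to your setup:

```shell
#!/bin/sh
# Assumed values: the file name comes from the question, the HDFS
# directory from the path shown in the AnalysisException.
LOCAL_FILE="CitiGroup2006"
HDFS_DIR="/user/hadoop"

if command -v hdfs >/dev/null 2>&1; then
  # Copy the local file into HDFS, then list the target directory
  # to confirm that it arrived.
  hdfs dfs -put "$LOCAL_FILE" "$HDFS_DIR/"
  hdfs dfs -ls "$HDFS_DIR"
else
  # The hdfs client is only expected on a cluster node.
  echo "hdfs command not found; run this on the EMR master node"
fi
```

Once the file is in HDFS, the original spark.read.textFile("CitiGroup2006") call should find it, because the error message shows that relative paths are resolved against /user/hadoop on HDFS. Prefixing the path with file:// should instead make Spark read from the local filesystem, but on a multi-node cluster the file then has to exist at that path on every node. :load, by contrast, is a REPL command that reads the script from the local filesystem of the machine running the shell, which would explain why it works.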
Upvotes: 2