Balaji Krishnan

Reputation: 457

SparkR in Windows

I'm trying to read Parquet files in SparkR, which is installed on Windows. When I issue the following command:

    all_tweets <- collect(read.parquet(sqlContext, "hdfs://localhost:9000/orcladv/internet/rawtweets"))

I get this error:

    Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : java.lang.AssertionError: assertion failed: No predefined schema found, and no Parquet data files or summary files found under file:/C:/Users/xxxxx/Documents/hdfs:/localhost:9000/orcladv/internet/rawtweets.

    at scala.Predef$.assert(Predef.scala:179)

I'm not sure why it prefixes the path with file:/C:/Users, since I passed an hdfs://localhost:9000 URI.

Please help.

Thanks

Bala

Upvotes: 1

Views: 157

Answers (1)

SpiritusPrana

Reputation: 480

Does this post help? It seems related, and offers some clues on how to find the correct hdfs path.

Replace the "hdfs://localhost:9000" prefix with the value of fs.defaultFS from your core-site.xml file.

If the HDFS path is invalid, Spark seems to assume that it must look in the local file system.
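As a rough illustration of the lookup suggested above, here is a minimal Python sketch that pulls fs.defaultFS out of core-site.xml text and joins it with the HDFS path from the question. The helper names `read_default_fs` and `qualify`, and the sample host and port, are hypothetical and only there for illustration. Your real core-site.xml will have different values.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_default_fs(core_site_xml: str) -> Optional[str]:
    """Return the fs.defaultFS value from core-site.xml text, or None if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            value = prop.findtext("value", default="").strip()
            return value or None
    return None


def qualify(default_fs: str, path: str) -> str:
    """Join the filesystem URI and an absolute HDFS path into one URI."""
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical core-site.xml content; the host name and port are made up.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    fs = read_default_fs(SAMPLE)
    if fs is not None:
        print(qualify(fs, "/orcladv/internet/rawtweets"))
```

The resulting URI is what you would then pass to read.parquet in SparkR in place of the hdfs://localhost:9000 path. Older Hadoop configurations may use the deprecated key fs.default.name instead of fs.defaultFS, in which case the lookup above would need that name.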

Upvotes: 0
