Reputation: 457
I'm trying to read Parquet files in SparkR, which is installed on Windows. When I run the following command:
all_tweets <- collect(read.parquet(sqlContext, "hdfs://localhost:9000/orcladv/internet/rawtweets"))
I get this error:
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : java.lang.AssertionError: assertion failed: No predefined schema found, and no Parquet data files or summary files found under file:/C:/Users/xxxxx/Documents/hdfs:/localhost:9000/orcladv/internet/rawtweets.
at scala.Predef$.assert(Predef.scala:179)
I'm not sure why it prefixes the path with file:/C:/Users, since the path I passed is an hdfs://localhost:9000 URI.
Please help.
Thanks
Bala
Upvotes: 1
Views: 157
Reputation: 480
Does this post help? It seems related, and offers some clues on how to find the correct hdfs path.
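Change "hdfs://localhost:9000" to the value of fs.defaultFS in your core-site.xml file.
If the HDFS path is invalid, Spark seems to assume that it must look in the local file system, which would explain why the error shows your path appended to file:/C:/Users/xxxxx/Documents/.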
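In case it helps with finding that value, here is a minimal Python sketch that pulls fs.defaultFS out of the text of a core-site.xml file and joins it with the directory from the question. The sample XML, the host name namenode.example.com, the port 8020, and the helper names default_fs and hdfs_uri are all made up for illustration; your own core-site.xml will have different values.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical core-site.xml content. The host and port are placeholders,
# not values taken from the question.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def default_fs(core_site_xml: str) -> Optional[str]:
    """Return the fs.defaultFS value from core-site.xml text, or None if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            return prop.findtext("value", default="").strip()
    return None


def hdfs_uri(core_site_xml: str, path: str) -> str:
    """Join the configured default file system with an absolute HDFS path."""
    base = default_fs(core_site_xml)
    if base is None:
        raise ValueError("fs.defaultFS is not set in the given core-site.xml")
    return base.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    # Print the full URI to try in place of the hdfs://localhost:9000/... string.
    print(hdfs_uri(SAMPLE_CORE_SITE, "/orcladv/internet/rawtweets"))
```

To use it against a real installation, read the contents of your own core-site.xml into a string and pass that instead of SAMPLE_CORE_SITE, then try the printed URI in the read.parquet call.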
Upvotes: 0