Reputation: 328
I am new to Spark and Scala as well, so this might be a very basic question.
I created a text file containing 4 lines of words. The rest of the code is below:
val data = sc.textFile("file:///home//test.txt").map(x=> x.split(" "))
println(data.collect)
println(data.take(2))
println(data.collect.foreach(println))
All of the above println calls produce output like: [Ljava.lang.String;@1ebec410
Any idea how I can display the actual contents of the RDD? I have even tried saveAsTextFile, but it also saves the same line as java...
I am using the IntelliJ IDE for Spark with Scala, and yes, I have gone through other posts related to this, but they did not help. Thanks in advance.
Upvotes: 1
Views: 4002
Reputation: 2294
The final type of data is RDD[Array[String]], so each element you were printing is an Array[String]. That prints something like [Ljava.lang.String;@1ebec410 because a Scala Array is a plain JVM array, and its toString() method is not overridden: it prints only the JVM type name ([Ljava.lang.String; means "array of String") followed by the object's hash code, not the contents.
You can convert the Array[String] to a List[String] by calling toList (this is a conversion rather than a cast; the method is available on arrays through an implicit conversion). You will then be able to see the contents, because the toString() method of List in Scala is overridden and shows the elements.
That means if you try
data.collect.foreach(arr => println(arr.toList))
this will show you the content. Alternatively, as @Raphael has suggested:
data.collect().foreach(arr => println(arr.mkString(", ")))
This will also work because arr.mkString(", ") converts the array into a single String with the elements separated by ", ".
Hope this clears up your doubt. Thanks.
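To see the difference without starting Spark, here is a minimal plain-Scala sketch of the three ways of printing an array discussed above. The object name ArrayPrintDemo, the helper names asList and joined, and the sample sentence are hypothetical and only stand in for one line of test.txt.

```scala
object ArrayPrintDemo {
  // Hypothetical sample, standing in for one line of the text file
  val words: Array[String] = "hello spark world".split(" ")

  // Convert to a List first; List overrides toString to show its elements
  def asList(arr: Array[String]): String = arr.toList.toString

  // Join the elements into a single String with a separator
  def joined(arr: Array[String]): String = arr.mkString(", ")

  def main(args: Array[String]): Unit = {
    println(words)          // JVM default toString: type name plus hash code
    println(asList(words))  // List(hello, spark, world)
    println(joined(words))  // hello, spark, world
    // mkString also accepts a prefix and suffix if an Array(...) look is preferred
    println(words.mkString("Array(", ", ", ")"))
  }
}
```

Only the first println shows the [Ljava.lang.String;@... form; the hash part will differ from run to run.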
Upvotes: 2
Reputation: 27383
data is of type RDD[Array[String]]; what you print is the toString of the Array[String] ([Ljava.lang.String;@1ebec410). Try this:
data.collect().foreach(arr => println(arr.mkString(", ")))
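The same idea can be applied before collecting, by mapping each Array[String] to a String on the RDD itself. The following is a rough sketch, assuming a local Spark installation on the classpath; the object name PrintRddDemo, the helper formatLine, and the in-memory sample lines (used here instead of the question's text file) are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PrintRddDemo {
  // Hypothetical helper: turn one split line back into readable text
  def formatLine(words: Array[String]): String = words.mkString(", ")

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; an assumption made for the demo
    val conf = new SparkConf().setAppName("PrintRddDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // In-memory sample lines instead of sc.textFile(...)
      val data = sc.parallelize(Seq("first line here", "second line here"))
        .map(_.split(" "))              // RDD[Array[String]]
      val lines = data.map(formatLine)  // RDD[String], printable as-is

      lines.take(2).foreach(println)    // print only the first two elements
      lines.collect().foreach(println)  // print every element on the driver
    } finally {
      sc.stop()
    }
  }
}
```

Since lines is an RDD[String], calling saveAsTextFile on it rather than on data should also write the words themselves instead of the [Ljava.lang.String;@... form, because saveAsTextFile writes each element's toString.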
Upvotes: 0