Silver Su

Reputation: 1

I don't get any result from a notebook in Bluemix Spark

I tried to run my Scala code in the Bluemix Spark service. It runs and returns the correct result on my local virtual machine, but when I run it in a Bluemix Spark notebook, I get no response.

import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg.Matrix

// Load the CSV from object storage and drop the header row
val input = sc.textFile("swift://notebooks.spark/pca.csv")
val header = input.first()
val inputData = input.filter(x => x != header).map(line => line.split(','))

// Build one dense vector per row from columns 1 to 11.
// This must map over inputData (the split fields), not the raw input lines.
val inputVector = inputData.map { d =>
  Vectors.dense(
    d(1).toDouble, d(2).toDouble, d(3).toDouble, d(4).toDouble, d(5).toDouble, d(6).toDouble,
    d(7).toDouble, d(8).toDouble, d(9).toDouble, d(10).toDouble, d(11).toDouble)
}

// Compute the top 5 principal components
val rowMatrix = new RowMatrix(inputVector)
val pca: Matrix = rowMatrix.computePrincipalComponents(5)

When I execute input.take(2), I get the expected result, but input.foreach(println) shows no output at all. That seems strange. How can I get the result?

Upvotes: 0

Views: 150

Answers (1)

Sven Hafeneger

Reputation: 801

I have tested it on Bluemix in a Scala notebook.

val input = sc.textFile("swift://notebooks.spark/test.csv")
input.take(1)           // shows the first line
input.foreach(println)  // nothing is displayed

If you want to display the contents of an RDD, you can use the following code.

input.take(5).foreach(println)     // shows the first 5 lines
input.collect().foreach(println)   // shows all lines

I do not know how your local VM is set up, but I think you have to distinguish between running your code locally and running it on a cluster.
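To illustrate the difference, here is a minimal sketch. It assumes the notebook-provided SparkContext `sc`, and the three sample lines are made up for illustration. On a cluster, the function passed to `foreach` runs inside the executor tasks, so `println` writes to each executor's stdout rather than to the notebook cell. `take` and `collect` first return a plain Scala array to the driver, so printing that array shows up in the notebook.

```scala
// Assumption: `sc` is the SparkContext the notebook already provides.
// The sample lines below are hypothetical, not taken from the original CSV.
val sample = sc.parallelize(Seq("a,1", "b,2", "c,3"), numSlices = 2)

// Runs on the executors: the output goes to the executors' stdout,
// which on a cluster is not the notebook cell's output.
sample.foreach(println)

// take and collect return an Array[String] to the driver first,
// so these loops run on the driver and the notebook displays the lines.
val firstTwo = sample.take(2)
firstTwo.foreach(println)

val all = sample.collect()
all.foreach(println)
```

Note that `collect()` pulls the whole RDD into driver memory, so `take(n)` is usually the safer choice for a quick look at a large file.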

Have a look at this answer for more information: How to print the contents of RDD?

Upvotes: 2
