David Rebe Garcia

Reputation: 445

How do I measure the runtime of code in Scala?

I need to measure the runtime of some Scala code. The code is:

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.textFile("/home/david/Desktop/Datos Entrada/household/household90Parseado.txt")
val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()

val numClusters = 5
val numIterations = 10
val clusters = KMeans.train(parsedData, numClusters, numIterations)

I need to know how long this code takes to run, and the time has to be in seconds.

Upvotes: 29

Views: 40199

Answers (6)

Ram Ghadiyaram

Reputation: 29195

  • Case : Before Spark 2.1.0

Before Spark 2.1.0, you can define this function in your own code to measure the time in milliseconds:

/**
 * Executes some code block and prints to stdout the time taken to execute the block. This is
 * available in Scala only and is used primarily for interactive testing and debugging.
 */
def time[T](f: => T): T = {
  val start = System.nanoTime()
  val ret = f
  val end = System.nanoTime()
  println(s"Time taken: ${(end - start) / 1000 / 1000} ms")
  ret
}

Usage :

  time {
    Seq("1", "2").toDS().count()
  }
//Time taken: 3104 ms

  • Case : Spark 2.1.0 and later

From Spark 2.1.0 onward, SparkSession has a built-in function for this: spark.time.

Usage :

  spark.time {
    Seq("1", "2").toDS().count()
  }
//Time taken: 3104 ms

Upvotes: 7

notNull

Reputation: 31520

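Starting with Spark 2.1.0, you can use spark.time(<command>) (Scala only so far) to print the time taken to execute a command. Because transformations are lazy, the command should end in an action (such as count) for the timing to be meaningful.

Example:

Counting the records in a DataFrame: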

scala> spark.time(
                 sc.parallelize(Seq("foo","bar")).toDF().count() //create df and count
                 )
Time taken: 54 ms //total time for the execution
res76: Long = 2  //count of records

Upvotes: 12

evan.oman

Reputation: 5572

Based on the discussion here, you'll want to use System.nanoTime to measure the elapsed time; dividing the difference by 1e9 gives the duration in seconds:

val t1 = System.nanoTime

/* your code */

val duration = (System.nanoTime - t1) / 1e9d
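
If you time more than one block, the same idea can be wrapped in a small helper. The following is only a sketch in plain Scala; the names TimingSketch and timed are made up for illustration, and the summation is a stand-in workload that you would replace with your own code (for example the KMeans.train call from the question).

```scala
object TimingSketch {
  /** Runs `block` once and returns its result together with the elapsed wall-clock seconds. */
  def timed[T](block: => T): (T, Double) = {
    val start = System.nanoTime()
    val result = block // the by-name argument is evaluated exactly once, here
    val seconds = (System.nanoTime() - start) / 1e9
    (result, seconds)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in workload (hypothetical); replace with the code you want to time.
    val (sum, seconds) = timed { (1L to 1000000L).sum }
    println(f"sum = $sum, took $seconds%.3f s")
  }
}
```

Returning the duration instead of printing it lets the caller decide how to log or aggregate it.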

Upvotes: 68

Sandish Kumar H N

Reputation: 312

This would be the best way to calculate the time taken by Scala code.

def time[R](block: => (String, R)): R = {
    val t0 = System.currentTimeMillis()
    // Evaluate the by-name block exactly once. Reading block._1 and block._2
    // separately would run the timed code twice.
    val (name, result) = block
    val t1 = System.currentTimeMillis()
    println(name + " took Elapsed time of " + (t1 - t0) + " Millis")
    result
}

val result = kuduMetrics.time {
    ("name for metric", your function call or your code)
}

Upvotes: 1

fr3ak

Reputation: 503

You can use ScalaMeter: https://scalameter.github.io/

Just put your block of code inside the braces:

import org.scalameter._

val executionTime = measure {
  // code goes here
}

You can configure it to warm up the JVM first, so that the measurements are more reliable:

val executionTime = withWarmer(new Warmer.Default) measure {
  //code goes here
}
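
To show what a warm-up buys you without adding a dependency, here is a hand-rolled sketch in plain Scala. It is not ScalaMeter's API or its algorithm; WarmupSketch, meanSeconds, and the parameters warmups and runs are hypothetical names. It runs the block a few times untimed and then averages several timed runs.

```scala
object WarmupSketch {
  /** Runs `block` `warmups` times untimed, then returns the mean seconds over `runs` timed runs. */
  def meanSeconds[T](warmups: Int, runs: Int)(block: => T): Double = {
    require(runs > 0, "runs must be positive")
    var i = 0
    while (i < warmups) { // warm-up phase: results and timings are discarded
      block
      i += 1
    }
    val samples = (1 to runs).map { _ =>
      val start = System.nanoTime()
      block
      (System.nanoTime() - start) / 1e9
    }
    samples.sum / runs
  }

  def main(args: Array[String]): Unit = {
    // Stand-in workload (hypothetical); replace with the code you want to time.
    val mean = meanSeconds(warmups = 5, runs = 10) { (1 to 100000).map(_ * 2).sum }
    println(f"mean time: $mean%.6f s")
  }
}
```

Keep in mind that the block is executed warmups + runs times, so this only suits code that is safe and cheap enough to repeat.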

Upvotes: 6

Larsenal

Reputation: 51186

The most basic approach is to record the start time and the end time, and subtract one from the other.

val startTimeMillis = System.currentTimeMillis()

/* your code goes here */

val endTimeMillis = System.currentTimeMillis()
val durationSeconds = (endTimeMillis - startTimeMillis) / 1000
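
One detail worth knowing about the last line: both operands are Long, so the division drops the fractional part. A small sketch of the difference, using a made-up measurement of 1500 ms (DurationUnits and its method names are hypothetical):

```scala
object DurationUnits {
  /** Long division: the fractional part is dropped. */
  def wholeSeconds(elapsedMillis: Long): Long = elapsedMillis / 1000

  /** Dividing by a Double keeps the fraction. */
  def fractionalSeconds(elapsedMillis: Long): Double = elapsedMillis / 1000.0

  def main(args: Array[String]): Unit = {
    val elapsedMillis = 1500L // hypothetical measurement
    println(wholeSeconds(elapsedMillis))      // prints 1
    println(fractionalSeconds(elapsedMillis)) // prints 1.5
  }
}
```

If sub-second precision matters, divide by 1000.0 instead of 1000.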

Upvotes: 12
