Reputation: 197
I have just started using Spark, and for now I only interact with it through spark-shell. I would like to benchmark how long various commands take, but I could not find how to get the current time or run a benchmark. Ideally I would do something very simple, such as:
val t = [current_time]
data.map(etc).distinct().reduceByKey(_ + _)
println([current time] - t)
Edit: I figured it out:
import org.joda.time._
val t_start = DateTime.now()
[[do stuff]]
val t_end = DateTime.now()
new Period(t_start, t_end).toStandardSeconds()
Upvotes: 1
Views: 999
Reputation: 40380
I suggest the following:
def time[A](f: => A) = {
  val s = System.nanoTime
  val ret = f
  println("time: " + (System.nanoTime - s) / 1e9 + " seconds")
  ret
}
You can pass any expression to time. Because f is a by-name parameter, the expression is evaluated inside time, which prints the elapsed time in seconds and then returns the expression's result.
For example, given a function foobar that takes data as its argument, you can write:
val test = time(foobar(data))
test will contain the result of foobar, and the time taken is printed as well.
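If you want to keep the measurement instead of only printing it, a variant along these lines may help. This is a minimal sketch, not part of the original answer: the names timed and benchmark are hypothetical helpers, and the workload is a plain local collection standing in for your Spark job so that the snippet runs on its own.

```scala
// Hypothetical helper: like `time`, but returns the elapsed seconds
// together with the result instead of printing them.
def timed[A](block: => A): (A, Double) = {
  val start = System.nanoTime
  val result = block // the by-name argument is evaluated here
  (result, (System.nanoTime - start) / 1e9)
}

// Hypothetical helper: evaluate the block `runs` times and collect the timings.
def benchmark[A](runs: Int)(block: => A): Seq[Double] =
  (1 to runs).map(_ => timed(block)._2)

// Stand-in workload on a local collection (assumption: replace with your own job).
val data = (1 to 100000).map(i => (i % 10, 1))

// Time a single evaluation and keep both the result and the duration.
val (counts, seconds) = timed {
  data.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
}
println(f"single run: $seconds%.4f s")

// Repeat the measurement to smooth out JVM warm-up effects.
val timings = benchmark(5)(data.groupBy(_._1).size)
println(f"min over 5 runs: ${timings.min}%.4f s")
```

In spark-shell, keep in mind that transformations such as map, distinct and reduceByKey are lazy, so the timed expression should end in an action such as count() or collect(); otherwise only the construction of the job is measured, not its execution.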
Upvotes: 3