zainul abid

Reputation: 85

CPU Usage Measurement and Calculation in R using proc.time()

How can I calculate the CPU usage of a function written in R on the Windows operating system? Can I calculate CPU usage using the proc.time() function?

I tried the following:

fibb <- function (n) {
  if (n < 3) {
    return(c(0,1)[n])
  } else {
    return(fibb(n - 2) + fibb(n -1))
  }
}

t2 <- proc.time()
fibb(30)
print("Time two")
x <- proc.time() - t2
cpu_usage <- (as.numeric(x)[[2]] / as.numeric(x)[[1]]) * 100  # system time / user time
print(paste("CPU usage:", round(cpu_usage, 2)))

Is this the correct way? If not, could you please help me with it?

Upvotes: 0

Views: 669

Answers (1)

Zheyuan Li

Reputation: 73385

t.start <- proc.time()
## <some R code>
t.end <- proc.time()
x <- t.end - t.start

I think the following are good measures of CPU usage:

## user / (user + system)
x[[1]] / (x[[1]] + x[[2]]) * 100 * m

## user / elapsed
x[[1]] / x[[3]] * 100

where m is the number of CPUs used in <some R code>.

  • If m = 1, i.e., in serial computing, they should give close results;

  • If m > 1, i.e., in parallel computing, they may differ (a lot) and I tend to trust the first one.


Example with m = 1

Using OP's fibb() function to compute a Fibonacci number.

t.start <- proc.time()
fibb(30)
t.end <- proc.time()
x <- t.end - t.start

x[[1]] / (x[[1]] + x[[2]]) * 100
#[1] 99.3689

x[[1]] / x[[3]] * 100
#[1] 99.19817

R-level parallel computing example with m = 2

Matrix computation without optimized BLAS.

library(parallel)
cl <- makeCluster(2)
t.start <- proc.time()
foo <- clusterApply(cl, c(4000, 4000), fun = function (n) crossprod(matrix(rnorm(n * n), n)))
t.end <- proc.time()
x <- t.end - t.start
stopCluster(cl)

x
#  user  system elapsed 
# 0.723   0.375   6.324

x[[1]] / (x[[1]] + x[[2]]) * 100 * 2
#[1] 131.694

x[[1]] / x[[3]] * 100
#[1] 11.43264

Hmm?? Well, not surprising. The jobs are actually done in 2 other R processes, not in our working R session, so the CPU is basically idle from the working session's point of view. I don't know how to get the right CPU usage in this case. (Maybe call proc.time() inside the function passed to clusterApply, so that we measure CPU usage for each worker R process separately?)
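That parenthetical idea could be sketched as follows (an untested sketch, not a verified recipe: each worker times its own computation with proc.time() and returns the timings alongside the result; smaller 2000x2000 matrices are used here to keep it quick):

```r
library(parallel)

cl <- makeCluster(2)

## each worker measures its own CPU time with proc.time()
res <- clusterApply(cl, c(2000, 2000), fun = function (n) {
  t0 <- proc.time()
  val <- crossprod(matrix(rnorm(n * n), n))
  t1 <- proc.time()
  list(value = val, time = t1 - t0)  # return result and per-worker timings
})

stopCluster(cl)

## per-worker CPU usage: user / elapsed, in percent
sapply(res, function (r) r$time[[1]] / r$time[[3]] * 100)
```

Since each worker is a serial R process, the user / elapsed measure should be meaningful per worker, and the numbers should come out close to 100 rather than the misleading session-level figures above.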

FORTRAN/C/C++ parallel computing example with m = 2

Matrix computation with OpenBLAS and 2 threads.

A <- matrix(rnorm(4000 * 4000), 4000)
t.start <- proc.time()
AA <- A %*% A
t.end <- proc.time()
x <- t.end - t.start

x[[1]] / (x[[1]] + x[[2]]) * 100 * 2
#[1] 197.1904

x[[1]] / x[[3]] * 100
#[1] 195.1596

Note:

If you have a big chunk of code that mixes serial and parallel computing, I advise breaking it into smaller chunks:

t.start <- proc.time()
## <R code chunk 1 - serial computing>
t.end <- proc.time()
x1 <- t.end - t.start
CPU.usage1 <- x1[[1]] / (x1[[1]] + x1[[2]]) * 100

t.start <- proc.time()
## <R code chunk 2 - parallel computing with m CPUs>
t.end <- proc.time()
x2 <- t.end - t.start
CPU.usage2 <- x2[[1]] / (x2[[1]] + x2[[2]]) * 100 * m

t.start <- proc.time()
## <R code chunk 3 - serial computing>
t.end <- proc.time()
x3 <- t.end - t.start
CPU.usage3 <- x3[[1]] / (x3[[1]] + x3[[2]]) * 100
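To avoid repeating the timing boilerplate for every chunk, the pattern can be wrapped in a small helper (a sketch only; cpu.usage is a made-up name, not an existing base R function):

```r
## sketch of a helper: evaluate an expression and return its CPU usage in %
## using the user / (user + system) measure, scaled by m CPUs
cpu.usage <- function (expr, m = 1) {
  expr <- substitute(expr)          # capture the unevaluated expression
  t.start <- proc.time()
  eval(expr, envir = parent.frame())
  x <- proc.time() - t.start
  unname(x[[1]] / (x[[1]] + x[[2]]) * 100 * m)
}

## serial chunk, m = 1
cpu.usage({ s <- 0; for (i in 1:2e6) s <- s + i })
```

Note that for a trivially fast expression both user and system time may be reported as 0, giving NaN, so this only makes sense for chunks that run long enough to register.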

Upvotes: 1
