Alternative to explicit for loop for setting matrix entries based on column indexes

Question

How can I achieve the same as below without using a for-loop?

df1 = data.frame( val = c("a", "c", "c", "b", "e") )  

m1 = matrix(0, nrow=nrow(df1), ncol=length( c("a", "b", "c", "d", "e") ) )
colnames(m1) = c("a", "b", "c", "d", "e")

for(i in 1:nrow(df1)){
  m1[i, df1[i, 1] ] = 1  #For each entry in dataframe, mark the respective column as 1
}

A. Webb · Accepted Answer

This

f <- function(m1, df) {
  # Loop version: set one entry per row of df
  for (i in seq_len(nrow(df)))
    m1[i, df[i, 1]] <- 1
  return(m1)
}

is equivalent to

g <- function(m1, df) {
  # Vectorised version: a two-column (row, column) index matrix sets all entries at once
  m1[cbind(seq_len(nrow(df)), df[, 1])] <- 1
  return(m1)
}

The latter is faster for this particular example, by roughly a factor of eight at the median:

> microbenchmark(f(m1,df1),g(m1,df1))
Unit: microseconds
       expr     min      lq      mean  median      uq     max neval cld
 f(m1, df1) 167.085 174.885 194.58999 185.969 200.132 342.379   100   b
 g(m1, df1)  20.116  22.990  27.12403  24.222  27.300 158.053   100  a
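
The vectorised version relies on R's matrix indexing: when a matrix is subscripted with a two-column numeric matrix, each row of that index is read as a (row, column) pair. A minimal sketch on a made-up 3x3 matrix (the names m and idx are illustrative, not from the answer):

```r
# Hypothetical 3x3 example, unrelated to the question's data
m <- matrix(0, nrow = 3, ncol = 3)

# Each row of idx is one (row, column) pair: (1,2), (2,2), (3,1)
idx <- cbind(c(1, 2, 3), c(2, 2, 1))

# A single assignment sets all three addressed cells
m[idx] <- 1
m
```

This is the same mechanism g uses, with seq_len(nrow(df)) supplying the row numbers.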

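Note, however:

Both functions index by the factor's integer codes rather than by the character column names, so they fill the intended columns only when the factor levels line up with the matrix's column order.
You should code what is clearest rather than what is fastest, unless and until you identify a true bottleneck.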
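
To make the first point concrete: if val is a factor (the data.frame() default before R 4.0.0), its levels for the question's data are a, b, c, e, so "e" has code 4 and would be written into column "d". One way around this, sketched below, is to look the values up in colnames() with match(). The helper name set_by_name is hypothetical and not part of the original answer, and stringsAsFactors = TRUE is set explicitly only to reproduce the factor case.

```r
# Question's data, forced to a factor column to reproduce the issue
df1 <- data.frame(val = c("a", "c", "c", "b", "e"), stringsAsFactors = TRUE)
cols <- c("a", "b", "c", "d", "e")
m1 <- matrix(0, nrow = nrow(df1), ncol = length(cols),
             dimnames = list(NULL, cols))

# Hypothetical helper: map each value to its column position by name,
# then set the cells with a two-column (row, column) index matrix
set_by_name <- function(m, values) {
  j <- match(as.character(values), colnames(m))
  stopifnot(!anyNA(j))  # stop if a value has no matching column
  m[cbind(seq_along(j), j)] <- 1
  m
}

m2 <- set_by_name(m1, df1$val)
m2
```

Because the lookup goes through as.character(), the sketch does not depend on whether val is stored as a factor or as a character vector.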
