apply problem giving each element a different name

Question

I need to optimize a small piece of code, which can be simplified as follows. Say I have two data frames, and I want to obtain a "result" data frame that is a selection of rows from data2 meeting some condition. For each selected row I need to add an identifier that corresponds to the row of the first data frame that selected it. This identifier is added to the resulting data frame as a column called "identity".

data=data.frame(a=sample(1:100, 100, replace=TRUE),b=sample(1:100, 100, replace=TRUE) )
data2=data.frame(a=sample(1:100, 100, replace=TRUE),b=sample(1:100, 100, replace=TRUE) )

result=NULL
for(i in 1:nrow(data)){  # I loop on each row of "data"
  # keep the rows of "data2" whose "a" is smaller than "a" in the
  # current row of "data" (i.e. the difference is bigger than zero)
  boolvect=data[i,"a"]-data2$a>0
  ares=data2[ boolvect,]
  if(nrow(ares)>0){
    # we add an identifier for such event, the identifier is the
    # row number of "data"
    ares$identity=i
    result=rbind(result,ares)
  }
}

I tried to use apply with margin 1. The results are the same but I don't know how to properly deal with the "identity" column.

all_df=apply(data, 1, function(x, data2){
  val=as.numeric(x["a"])
  boolvect=val-data2$a>0
  return(data2[boolvect,])
}, data2=data2)

result2=do.call(rbind, all_df)

Any help please?

Ronak Shah · Accepted Answer

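To get the identity column we need to iterate over the row index of data rather than over its rows: apply(data, 1, ...) passes only the values of each row to the function, not its row number.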

You can do this using lapply or Map.

result1 <- do.call(rbind, lapply(seq_along(data$a), function(i) {
  boolvect <- data$a[i] - data2$a > 0
  if(any(boolvect)) transform(data2[boolvect, ], identity = i)
}))
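The `if` without an `else` returns NULL whenever no row of data2 matches, and `rbind` on data frames skips NULL entries. A minimal sketch of that behaviour, using a made-up list called `pieces` (hypothetical, not from the question):

```r
# Hypothetical toy input: two data frames with a NULL between them,
# mimicking an iteration where no row of data2 matched.
pieces <- list(
  data.frame(a = 1:2, identity = 1L),
  NULL,
  data.frame(a = 3L, identity = 3L)
)

# rbind on data frames drops the NULL element and stacks the rest
combined <- do.call(rbind, pieces)
```

The gap in the `identity` values (no 2 here) shows which iterations selected nothing.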

With Map:

result2 <- do.call(rbind, Map(function(x, y) {
  boolvect <- x - data2$a > 0
  if(any(boolvect)) transform(data2[boolvect, ], identity = y)
}, data$a, seq_len(nrow(data))))
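As a sanity check, the sketch below compares both versions against the loop from the question. The fixed seed and the names `loop_result`, `lapply_result`, `map_result` and `drop_rownames` are assumptions added for illustration; row names are reset before comparing because `rbind` may generate different ones in each approach.

```r
set.seed(42)  # assumption: fixed seed, only so the check is reproducible
data  <- data.frame(a = sample(1:100, 100, replace = TRUE),
                    b = sample(1:100, 100, replace = TRUE))
data2 <- data.frame(a = sample(1:100, 100, replace = TRUE),
                    b = sample(1:100, 100, replace = TRUE))

# Loop from the question
loop_result <- NULL
for (i in seq_len(nrow(data))) {
  boolvect <- data[i, "a"] - data2$a > 0
  ares <- data2[boolvect, ]
  if (nrow(ares) > 0) {
    ares$identity <- i
    loop_result <- rbind(loop_result, ares)
  }
}

# lapply over the row index
lapply_result <- do.call(rbind, lapply(seq_along(data$a), function(i) {
  boolvect <- data$a[i] - data2$a > 0
  if (any(boolvect)) transform(data2[boolvect, ], identity = i)
}))

# Map over the values and the row index in parallel
map_result <- do.call(rbind, Map(function(x, y) {
  boolvect <- x - data2$a > 0
  if (any(boolvect)) transform(data2[boolvect, ], identity = y)
}, data$a, seq_len(nrow(data))))

# Hypothetical helper: compare contents only, ignoring row names
drop_rownames <- function(df) { rownames(df) <- NULL; df }
same_lapply <- isTRUE(all.equal(drop_rownames(loop_result), drop_rownames(lapply_result)))
same_map    <- isTRUE(all.equal(drop_rownames(loop_result), drop_rownames(map_result)))
```

If both flags are TRUE, the three approaches select the same rows in the same order and attach the same identity values.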
