Ben

Reputation: 21695

How do I make this R code run faster?

Goal: Generate a set of rectangular boxes and stack them onto a 10x10 grid. My function getCoordinates generates numBoxes randomly sized boxes with integer side lengths between 1 and 10. The variable gridTops keeps track of the maximum height occupied in each cell of the 10x10 grid. The function returns a two-element list containing the boxes data frame and a matrix with the coordinates of the stacked boxes.

getCoordinates<-function(numBoxes){
  # Generates numBoxes random sized boxes with integer dimensions between 1 and 10.
  # Stacks them and returns results
  boxes<-data.frame(boxId=1:numBoxes,
                    xDim=sample(10,numBoxes, replace=TRUE),
                    yDim=sample(10,numBoxes, replace=TRUE),
                    zDim=sample(10,numBoxes, replace=TRUE))
  gridTops<-matrix(0,nrow=10,ncol=10)
  coordinates<-matrix(nrow=nrow(boxes),
                      ncol=6,
                      dimnames=list(boxes$boxId,c("x1","y1","z1","x2","y2","z2")))
  for(i in 1:nrow(boxes)){
    mylist<-addBox(boxes[i,], gridTops);
    coordinates[i,]<-mylist[["coordinates"]];
    gridTops<-mylist[["gridTops"]];
  }
  return(list(boxes=boxes, coordinates=coordinates));
}

addBox<-function(boxDimensions, gridTops){
  #Returns a list of coordinates and the updated gridTops matrix
  xd<-boxDimensions$xDim
  yd<-boxDimensions$yDim
  zd<-boxDimensions$zDim
  x1<-0
  y1<-0
  z1<-max(gridTops[1:xd,1:yd])
  gridTops[1:xd,1:yd]<-(z1+zd)
  coordinates<-c(x1,y1,z1,x1+xd,y1+yd,z1+zd)
  return(list(coordinates=coordinates,gridTops=gridTops))
}

As an example,

test<-getCoordinates(5)
test[["boxes"]]
  boxId xDim yDim zDim
1     1    5    2    4
2     2    9    1    4
3     3    1    7    7
4     4   10    6    1
5     5    5    8   10
test[["coordinates"]]
  x1 y1 z1 x2 y2 z2
1  0  0  0  5  2  4
2  0  0  4  9  1  8
3  0  0  8  1  7 15
4  0  0 15 10  6 16
5  0  0 16  5  8 26

As you can see, my method of stacking the boxes just puts one on top of another, with one corner on the (x=0,y=0) cell. Simple enough, but it takes a long time to stack 10000+ boxes. For example:

system.time(getCoordinates(10000))
 user  system elapsed 
2.755   0.414   3.169

I think my for loop is slowing things down, but I don't know how to use one of the apply functions in this situation, since each iteration depends on the gridTops produced by the previous one. How can I speed this up?

Edit: The function addBox is subject to change. As I mentioned, it simply stacks each box directly on top of the previous one. This is a naive packing algorithm that I wrote for illustrative purposes.

Upvotes: 0

Views: 166

Answers (1)

Neal Fultz

Reputation: 9696

Changing boxes from a data.frame to a matrix speeds it up considerably for me.

I changed

boxes<-data.frame(

to

boxes <- cbind(

and edited the places where the two functions access boxes. Timings before and after the edit:

R>system.time(getCoordinates(10000))
   user  system elapsed 
  1.926   0.000   1.941 
R>getCoordinates <- edit(getCoordinates)
R>addBox <- edit(addBox)
R>system.time(getCoordinates(10000))
   user  system elapsed 
  0.356   0.002   0.362 
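
The edited functions are not shown above, so here is a minimal sketch of what the matrix-based version might look like. It assumes the only changes are building boxes with cbind and reading each row as a named vector instead of a one-row data.frame; the [["xDim"]] style of access and the x2/y2/z2 column names are choices made for this sketch, not necessarily what was typed into edit(). The likely reason for the speedup is that indexing a row of a data.frame on every iteration carries much more overhead than indexing a row of a matrix.

```r
# Sketch only: a matrix-based variant of the two functions from the question.
getCoordinates <- function(numBoxes) {
  # cbind() of integer vectors gives an integer matrix rather than a data.frame
  boxes <- cbind(boxId = seq_len(numBoxes),
                 xDim  = sample(10, numBoxes, replace = TRUE),
                 yDim  = sample(10, numBoxes, replace = TRUE),
                 zDim  = sample(10, numBoxes, replace = TRUE))
  gridTops <- matrix(0, nrow = 10, ncol = 10)
  coordinates <- matrix(nrow = numBoxes, ncol = 6,
                        dimnames = list(as.character(seq_len(numBoxes)),
                                        c("x1", "y1", "z1", "x2", "y2", "z2")))
  for (i in seq_len(numBoxes)) {
    # boxes[i, ] drops to a named vector, which is cheap to extract
    res <- addBox(boxes[i, ], gridTops)
    coordinates[i, ] <- res[["coordinates"]]
    gridTops <- res[["gridTops"]]
  }
  list(boxes = boxes, coordinates = coordinates)
}

addBox <- function(boxDimensions, gridTops) {
  # boxDimensions is a named vector here, so use [[ ]] instead of $
  xd <- boxDimensions[["xDim"]]
  yd <- boxDimensions[["yDim"]]
  zd <- boxDimensions[["zDim"]]
  # Same naive stacking rule as in the question: anchor at the (0,0) corner
  z1 <- max(gridTops[1:xd, 1:yd])
  gridTops[1:xd, 1:yd] <- z1 + zd
  list(coordinates = c(0, 0, z1, xd, yd, z1 + zd), gridTops = gridTops)
}
```

The loop itself stays, because each box depends on the gridTops left by the previous one; only the per-iteration data access gets cheaper.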

Upvotes: 1
