Reputation: 1932
I've been trying to increase the speed of some code. I've removed all loops, am using vectors and have streamed lined just about everything. I've timed each iteration of my code and it appears to be slowing as iterations increase.
### The beginning iterations
user system elapsed
0.03 0.00 0.03
user system elapsed
0.03 0.00 0.04
user system elapsed
0.03 0.00 0.03
user system elapsed
0.04 0.00 0.05
### The ending iterations
user system elapsed
3.06 0.08 3.14
user system elapsed
3.10 0.05 3.15
user system elapsed
3.08 0.06 3.15
user system elapsed
3.30 0.06 3.37
I have 598 iterations and right now it takes about 10 minutes. I'd like to speed things up. Here's how my code looks. You'll need the RColorBrewer
and fields
packages. Here's my data. Yes I know its big, make sure you download the zip file.
StreamFlux <- function(data,NoR,NTS){
###Read in data to display points###
WLX = c(8,19,29,20,13,20,21)
WLY = c(25,28,25,21,17,14,12)
WLY = 34 - WLY
WLX = WLX / 44
WLY = WLY / 33
timedata = NULL
mf <- function(i){
b = (NoR+8) * (i-1) + 8
###I read in data one section at a time to avoid headers
mydata = read.table(data,skip=b,nrows=NoR, header=FALSE)
rows = 34-mydata[,2]
cols = 45-mydata[,3]
flows = mydata[,7]
rows = as.numeric(rows)
cols = as.numeric(cols)
rm(mydata)
###Create Flux matrix
flow_mat <- matrix(0,44,33)
###Populate matrix###
flow_mat[(rows - 1) * 44 + (45-cols)] <- flows+flow_mat[(rows - 1) * 44 + (45-cols)]
flow_mat[flow_mat == 0] <- NA
rm(flows)
rm(rows)
rm(cols)
timestep = i
###Specifying jpeg info###
jpeg(paste("Steamflow", timestep, ".jpg",sep = ''),
width = 640, height=441,quality=75,bg="grey")
image.plot(flow_mat, zlim=c(-1,1),
col=brewer.pal(11, "RdBu"),yaxt="n",
xaxt="n", main=paste("Stress Period ",
timestep, sep = ""))
points(WLX,WLY)
dev.off()
rm(flow_mat)
}
ST<- function(x){functiontime=system.time(mf(x))
print(functiontime)}
lapply(1:NTS, ST)
}
###To run all timesteps###
StreamFlux("stream_out.txt",687,598)
###To run the first 100 timesteps###
StreamFlux("stream_out.txt",687,100)
###The first 200 timesteps###
StreamFlux("stream_out.txt",687,200)
print(functiontime)
to stop it printing at every timestep then> system.time(StreamFlux("stream_out.txt",687,100))
user system elapsed
28.22 1.06 32.67
> system.time(StreamFlux("stream_out.txt",687,200))
user system elapsed
102.61 2.98 106.20
What I'm looking for is anyway to speed up running this code and possibly an explanation of why it is slowing down? Should I just run it in parts, seems a stupid solution. I've read about dlply
from the plyr
. It seems to have worked here but would that help in my case? How about parallel processing, I think I can figure that out but is it worth the trouble in this case?
Upvotes: 0
Views: 439
Reputation: 8717
I will follow @PaulHiemstra's suggestion and post my comment as an answer. Who can resist Internet points? ;)
From a quick glance at your code, I agree with @joran's second point in his comment: your loop/function is probably slowing down due to repeatedly reading in your data. More specifically, this part of the code probably needs to be fixed:
read.table(data, skip=b, nrows=NoR, header=FALSE)
.
In particular, I think the skip=b
argument is the culprit. You should read in all the data at the beginning, if possible, and then retrieve the necessary parts from memory for the calculations.
Upvotes: 3