generic_user

Reputation: 3562

Advice on plyr-ing this 'for' loop

Try as I might, I can't quite figure out how to get plyr to work. Would appreciate any help with this specific example, and bonus points for any explanations of why your example works.

The data is here, in case anyone wants to make this example "workable"

Setup:

library(ncdf)
library(plyr)
longs = seq.int(from=1,to=4320,by=6)
lats = seq.int(from=1,to=2160,by=6)
blocksize = 6
maize.nc = open.ncdf('maize_5min.nc')

This for loop works, but is slow and not parallel:

latlongcoords = NULL
for (x in 1:length(lats)) {        # x indexes lats
    for (y in 1:length(longs)) {   # y indexes longs
        lat = mean(get.var.ncdf(maize.nc,varid="latitude",start = lats[x],count = blocksize))
        lon = mean(get.var.ncdf(maize.nc,varid="longitude",start = longs[y],count = blocksize))
        rw = c(lat,lon)
        latlongcoords = rbind(latlongcoords,rw)
    }
    #print(x)
}
save(latlongcoords,file="latlongcoords")

I want to do something like this:

require(doMC)
registerDoMC(32)
x = seq(1:10)  #shorten it for testing purposes
y = seq(1:10)  #shorten it for testing purposes
makecoords = function(x,y){
    lat = mean(get.var.ncdf(maize.nc,varid="latitude",start = lats[x],count = blocksize))
    lon = mean(get.var.ncdf(maize.nc,varid="longitude",start = longs[y],count = blocksize))
    c(lat,lon)
    }
latlongcoords = NULL
latlongcoords = aaply(.data = cbind(x,y), .margins=2, .fun=makecoords(x,y),.parallel=TRUE)

When I run it, I get this error message:

Error in get.var.ncdf(maize.nc, varid = "latitude", start = lats[x], count = blocksize) : 
  Error: variable has 1 dims, but start has 10 entries.  They must match!

It looks like plyr is passing the whole vector to the function, not the individual values! Advice for how to make this work, and explanations of why your fix works, are really appreciated!

Thanks in advance!

Upvotes: 0

Views: 423

Answers (1)

Arun

Reputation: 118859

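Try this. I think you have to pass .margins = 1 instead of 2 here: with .margins = 1, aaply splits the matrix by row, so each call receives one (x, y) pair. Also note that .fun = makecoords(x,y) is a call, not a function, so R evaluates it with the full x and y vectors, which is where your error comes from. Pass an anonymous function instead: each row of cbind(x,y), starting with x[1] and y[1], arrives as w, and w[1] and w[2] are handed on to your makecoords function. I hope this is what you expect. If not, feel free to comment on what's going wrong.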

latlongcoords <- aaply(cbind(x,y), 1, function(w) 
                 makecoords(w[1], w[2]), .parallel=TRUE)

It works fine with .parallel = FALSE for me. I can't test .parallel = TRUE right now.

head(latlongcoords)
# X1              1            2
#   1    89.7500025 -179.7499949
#   2    86.7500025 -176.7499949
#   3    83.7500025 -173.7499949
#   4    80.7500025 -170.7499949
#   5    77.7500025 -167.7499949
#   6    74.7500025 -164.7499949
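
If you want to see the mechanics without the NetCDF file, here is a minimal sketch. fake_coords is a hypothetical stand-in for makecoords (it is not part of ncdf or plyr) that simply refuses vector input, loosely mimicking the get.var.ncdf error. The last part is an assumption about what you ultimately want: your nested for loop visits every lat/long combination, whereas cbind(x,y) only pairs x[i] with y[i], so I build the full grid with expand.grid.

```r
library(plyr)

# Hypothetical stand-in for makecoords(): accepts one x and one y only.
fake_coords <- function(x, y) {
  stopifnot(length(x) == 1, length(y) == 1)
  c(lat = 90 - x, lon = -180 + y)
}

x <- 1:3
y <- 1:3
pairs <- cbind(x, y)  # 3 rows (one per pair), 2 columns (x and y)

# .margins = 2 splits by column, so each piece is a whole column.
col_lengths <- aaply(pairs, 2, length)

# .margins = 1 splits by row, so w holds a single (x, y) pair.
by_row <- aaply(pairs, 1, function(w) fake_coords(w[[1]], w[[2]]))

# .fun = fake_coords(x, y) is a call, not a function: R evaluates it
# with the full x and y vectors, so it fails before doing useful work.
res <- tryCatch(
  aaply(pairs, 2, .fun = fake_coords(x, y)),
  error = function(e) "error"
)

# All combinations, as in the nested for loop (assumed to be the goal).
grid <- as.matrix(expand.grid(x = x, y = y))  # 9 rows
all_cells <- aaply(grid, 1, function(w) fake_coords(w[[1]], w[[2]]))
```

Once this behaves as expected on the toy function, swapping makecoords back in and adding .parallel = TRUE should be a small step, though I have not tested the parallel path.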

Upvotes: 1
