Arhopala

Reputation: 366

Applying a function to a multidimensional array with grouping variable

I have what I thought would be a simple problem, but I haven't been able to find an appropriate answer. I have a multidimensional array v[x,y,z] and I would like to apply a function to the array along the z dimension using a grouping variable (group). Here is an example (in R):

v<-1:81
dim(v)<-c(3,3,9)
group<-c('a','a','a','b','b','b','c','c','c')

Given that the grouping variable has 3 levels (a, b and c), the result (out) I'm looking for is an array of dimension 3x3x3. I can obtain out using the following code for the above example:

out1<-apply(v[,,c(1:3)],c(1,2),sum)
out2<-apply(v[,,c(4:6)],c(1,2),sum)
out3<-apply(v[,,c(7:9)],c(1,2),sum)

library(abind)
out<-abind(out1, out2, out3, along=3) 

My question is whether there is a general way of obtaining the above result that can be applied to large arrays and long grouping vectors.

Upvotes: 9

Views: 1902

Answers (4)

GKi

Reputation: 39747

Split the indices by group and iterate over the groups with lapply. Use each group's indices to subset the array, and sum over the subset with apply. Then simplify the resulting list to an array with simplify2array.

x <- simplify2array( lapply(split(seq_along(group), group), \(i)
                    apply(v[,,i], 1:2, sum)) )
all.equal(x, out, check.attributes = FALSE)
#[1] TRUE

Because the function here is sum, rowSums could also be used.

x <- simplify2array( lapply(split(seq_along(group), group), \(i)
                    rowSums(v[,,i], dims = 2)) )

Another way is to use tapply inside apply, in which case the dimensions need to be reordered with aperm:

x <- apply(v, 1:2, tapply, group, sum)
all.equal(aperm(x, c(2,3,1)), out, check.attributes = FALSE)
#[1] TRUE
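
The split/lapply approach above can be wrapped into a small reusable function. This is only a sketch: the name group_apply and its arguments are hypothetical (not from any package), and it assumes a three-dimensional array grouped along its third dimension.

```r
# Hypothetical helper: apply `fun` over the third dimension of `arr`,
# separately for each level of `group`.
group_apply <- function(arr, group, fun = sum) {
  stopifnot(length(dim(arr)) == 3, length(group) == dim(arr)[3])
  idx <- split(seq_along(group), group)
  res <- lapply(idx, function(i) {
    # drop = FALSE keeps a 3-d slice even when a group has a single member
    apply(arr[, , i, drop = FALSE], 1:2, fun)
  })
  # Stack the per-group matrices along a new third dimension
  simplify2array(res)
}

v <- 1:81
dim(v) <- c(3, 3, 9)
group <- c('a','a','a','b','b','b','c','c','c')

res <- group_apply(v, group)
dim(res)
```

Passing a different function, for example group_apply(v, group, mean), should work the same way as long as it returns a single value per cell.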

Upvotes: 0

Simon O'Hanlon

Reputation: 60000

Using the package raster might be more appropriate for your needs. It has some code optimised for handling remotely sensed data, taking care of processing in chunks. Consider this example:

library(raster)

## Make 12 rasters, maybe one for each month of the year
for( i in seq(12) ){
    assign( paste0( "r" , i ) , raster( matrix(runif(1e3) , nrow = 1e2 ) ) )
}

## Create a raster stack from these
rS <- stack( mget( paste0("r",1:12) , envir = .GlobalEnv ) )

## Use calc to get mean, using by to group by a variable
## In this example I use the vector (1,1,1,2,2,2,3,3,3,4,4,4)
## meaning I get means for the first 3 rasters, then the next 3 etc
## So I get a mean for each quarter
rMean <- calc( rS , fun = function(x){ by(x , c( rep( 1:4 , each=3 ) ) , mean ) }  )

Which returns a raster brick with 4 layers (one mean for each quarter):

class       : RasterBrick 
dimensions  : 100, 10, 1000, 4  (nrow, ncol, ncell, nlayers)
resolution  : 0.1, 0.01  (x, y)
extent      : 0, 1, 0, 1  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
data source : in memory
names       :         X1,         X2,         X3,         X4 
min values  : 0.02096586, 0.04015260, 0.04704145, 0.05884161 
max values  :  0.9727491,  0.9303025,  0.9804486,  0.9934670

I hope you can adapt this to your data.

Upvotes: 5

flodel

Reputation: 89107

Easy:

out <- apply(v, c(1, 2), by, group, sum)

But apply puts the group dimension first, so to get the dimensions in the same order as in your out, permute them with aperm:

out <- aperm(apply(v, c(1, 2), by, group, sum), c(2, 3, 1))
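
Since every dimension in the question's example has length 3, the reordering is hard to see there. The sketch below uses a made-up 2x4x6 array (w and g are illustrative names, not from the question) and tapply in place of by, which is an assumption made to keep the example light. The point it shows is that apply puts the grouped dimension first regardless of the grouping function.

```r
# Illustrative data: 2 x 4 x 6 array, three groups of two slices each
w <- array(1:48, dim = c(2, 4, 6))
g <- c('a', 'a', 'b', 'b', 'c', 'c')

# apply prepends the dimension produced by the function (one entry per group)
raw <- apply(w, c(1, 2), tapply, g, sum)
dim(raw)    # 3 2 4

# Move the group dimension to the end: (group, x, y) -> (x, y, group)
fixed <- aperm(raw, c(2, 3, 1))
dim(fixed)  # 2 4 3
```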

Upvotes: 8

krlmlr

Reputation: 25484

This is much easier if your data is formatted as a data frame:

library(plyr)
vd <- adply(v, 1:3)
head(vd)

  X1 X2 X3 V1
1  1  1  1  1
2  2  1  1  2
3  3  1  1  3
4  1  2  1  4
5  2  2  1  5
6  3  2  1  6

Then, you can simply attach your grouping...

vd$group <- rep(group, rep(3 * 3, length(group)))

...and split according to this grouping:

daply(vd, .(group), function(df) { ... } )

The anonymous function { ... } will be called once for each group, with df containing the sub-data-frame corresponding to that group. Here you could recombine and aggregate the data into a matrix using similar machinery. The function should return an array of dimensions 3x3x1; daply will concatenate these to form the desired result.
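
For comparison, the same long-format idea can be sketched in base R without plyr. This is not the daply call above, whose function body is left open; it swaps in tapply for the aggregation step and builds the data frame with arrayInd, both of which are assumptions made for this illustration.

```r
v <- 1:81
dim(v) <- c(3, 3, 9)
group <- c('a','a','a','b','b','b','c','c','c')

# One row per array cell: X1, X2, X3 are the indices, V1 is the value
idx <- arrayInd(seq_along(v), dim(v))
vd <- data.frame(X1 = idx[, 1], X2 = idx[, 2], X3 = idx[, 3],
                 V1 = as.vector(v))

# Look up each row's group from its position along the third dimension
vd$group <- group[vd$X3]

# Sum V1 within every (X1, X2, group) combination; the result is a 3-d array
out_long <- tapply(vd$V1, list(vd$X1, vd$X2, vd$group), sum)
dim(out_long)
```

Because the grouping is a plain column lookup, this form extends naturally to grouping vectors of any length, at the cost of holding one row per array cell in memory.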

Upvotes: 2
