Reputation: 2666
Say I have a data set, d1, that describes the abundance of different species at different sites:
site <- 1:5
species1 <- c('A','A','B','C','A')
abundance1 <- c(0.11,0.45,0.87,1.00,0.23)
species2 <- c('B','C','A','A','C')
abundance2 <- 1 - abundance1
d1 <- data.frame(site,species1,abundance1,species2,abundance2)
So, each site has two species, and each species has an abundance column that describes the proportion of the total community it represents.
I then have a second data set, d2, that describes some trait measurement of each species within a site, for instance weight. So, species A in site 1 may have a different observation of weight than species A in site 2. The data frame d2 looks like this:
site <- c(1,1,2,2,3,3,4,4,5,5)
species <- c('A','B','A','C','B','A','C','A','A','C')
weight <- rnorm(10, 50, 4)
d2 <- data.frame(site,species,weight)
I would like to generate a column within d1 that is the abundance-weighted average of weight, using the weight data in d2, such that each species within a site is assigned its own observation of weight in the final calculation.
The expected value for the first entry of the new vector would be the result of the expression:
d1[1,3]*d2[1,3] + d1[1,5]*d2[2,3]
Upvotes: 0
Views: 60
Reputation: 376
Old-school R. There may be an easier way with other packages, but this is straightforward with apply.
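d1$newvec <- apply(d1, 1, function(x) {
  s <- as.numeric(x[1])  # apply() passes each row as character; convert site back to numeric
  d2$weight[d2$site == s & d2$species == x[2]] * as.numeric(x[3]) +
    d2$weight[d2$site == s & d2$species == x[4]] * as.numeric(x[5])
})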
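The same per-row lookup can probably also be done without apply, by matching on a combined site/species key. This is only a sketch of that alternative, not part of the answer above: it rebuilds the question's data (with a set.seed() call added purely so the numbers are reproducible), and the names key2, w1, w2 and newvec_match are made up for illustration. It assumes every site/species pair in d1 appears exactly once in d2.

```r
# Rebuild the question's data; set.seed() is an addition for reproducibility.
set.seed(1)
d1 <- data.frame(site = 1:5,
                 species1 = c('A','A','B','C','A'),
                 abundance1 = c(0.11, 0.45, 0.87, 1.00, 0.23),
                 species2 = c('B','C','A','A','C'))
d1$abundance2 <- 1 - d1$abundance1
d2 <- data.frame(site = rep(1:5, each = 2),
                 species = c('A','B','A','C','B','A','C','A','A','C'),
                 weight = rnorm(10, 50, 4))

# Build a "site species" key for d2, then find each d1 pair's row in d2.
# Assumption: each (site, species) pair occurs once in d2.
key2 <- paste(d2$site, d2$species)
w1 <- d2$weight[match(paste(d1$site, d1$species1), key2)]
w2 <- d2$weight[match(paste(d1$site, d1$species2), key2)]

# Abundance-weighted average of weight for each site.
d1$newvec_match <- d1$abundance1 * w1 + d1$abundance2 * w2
```

Because match() works on whole columns, this avoids the conversion of each row to character that apply performs on a mixed-type data frame. A pair missing from d2 would give NA rather than an error, which may or may not be what you want.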
Upvotes: 1