Reputation: 3941

Calculate mean of rows when matching separate dataframe

Given a data frame with index and data columns like so:

AIndex <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
AData  <- c(3,5,6,7,3,2,1,2,3,4,5,6,7,8,9)
DF <- data.frame(AIndex,AData)

And given a second data frame with some overlap in the index like so:

BIndex <- c(1,4,8,11,13)
BData  <- c(3,5,7,6,5)
DF2 <- data.frame(BIndex,BData)

My goal is to find the rows in A whose index also appears in B, and then calculate the mean of AData over each matching row and the two rows that follow it.

For example, the first match is at row 1 of A. So I would want the corresponding data point in A (3) and the next two data points (5 and 6), which gives a mean of about 4.67.

The final result would be a new data frame that looked like this:

Upvotes: 2

Answers (3)

Colonel Beauvel

Reputation: 31181

You can do this using the data.table package:

library(data.table)

setDT(DF2)[,mean(DF[BIndex:(BIndex+2),'AData']),BIndex]
#   BIndex       V1
#1:      1 4.666667
#2:      4 4.000000
#3:      8 3.000000
#4:     11 6.000000
#5:     13 8.000000
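
Note that `DF[BIndex:(BIndex+2), 'AData']` selects rows of `DF` by position, which works here because `AIndex` happens to equal the row number. As a hedged base R sketch for the case where it does not, the positions can be looked up with `match()` first. The shifted data (`DFs`, `DF2s`) below is made up for illustration and is not from the question:

```r
# Made-up variant of the question's data: indices start at 101,
# so an index value is no longer the same as its row number
DFs  <- data.frame(AIndex = 101:115,
                   AData  = c(3,5,6,7,3,2,1,2,3,4,5,6,7,8,9))
DF2s <- data.frame(BIndex = c(101,104,108,111,113))

# Row position in DFs of each BIndex (NA if the index is absent)
pos <- match(DF2s$BIndex, DFs$AIndex)

# Mean of the matched row and the next two rows; NA when there is no match
DF2s$V1 <- sapply(pos, function(i) {
  if (is.na(i)) NA_real_ else mean(DFs$AData[i:(i + 2)])
})
```

With this data `DF2s$V1` should hold the same five means as the output above, since only the index labels were shifted.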

Upvotes: 2

Phomthep S.

Reputation: 21

I am new to R and this community. I tried to follow @csgillespie's steps, and the result turns out to be this:

> DF2 <- data.frame(BIndex, BData)
> mats <- match(DF2$BIndex, DF$AIndex)
> newInd <- merge(DF, DF2, by.x="AIndex", by.y="BIndex", all.y=TRUE)
> newInd$newCM <- (AData[mats] + AData[mats+1] + AData[mats+2]) / 3
> newInd

  AIndex AData BData    newCM  
1      1     3     3 4.666667  
2      4     7     5 4.000000  
3      8     2     7 3.000000  
4     11     5     6 6.000000  
5     13     7     5 8.000000

Thanks!

Upvotes: 2

csgillespie

Reputation: 60492

There are a few ways of doing this. The first step usually involves finding where the elements match:

mats = match(DF2$BIndex, DF$AIndex)

To find the means, just add up the relevant values and divide by three:

(AData[mats] + AData[mats+1] + AData[mats+2])/3
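
If the window might need to be wider than three rows, the same idea can be written without spelling out each term. This is only a sketch: the `width` variable and the use of `outer()` with `rowMeans()` are my own choices for illustration, not something the question asks for.

```r
AIndex <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
AData  <- c(3,5,6,7,3,2,1,2,3,4,5,6,7,8,9)
DF  <- data.frame(AIndex, AData)
DF2 <- data.frame(BIndex = c(1,4,8,11,13), BData = c(3,5,7,6,5))

width <- 3  # assumed window: the matched row plus the next two rows

mats <- match(DF2$BIndex, DF$AIndex)

# One row per match, one column per offset (0, 1, ..., width - 1)
idx <- outer(mats, 0:(width - 1), "+")

# Indexing a plain vector with a matrix returns a vector in column-major
# order, so reshape it back before averaging across each row
vals <- matrix(DF$AData[idx], nrow = nrow(idx))
result <- data.frame(BIndex = DF2$BIndex, Mean = rowMeans(vals))
```

A match that is missing, or that sits too close to the end of `DF`, yields `NA` for that row rather than an error, because out-of-range and `NA` subscripts return `NA`.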

Upvotes: 2
