Reputation: 605
I have a matrix of asset returns. For each row, I want to select the top N assets by rank and calculate the mean return of the selected assets, where N varies from row to row. For example, from the first row I want the mean of the top 3 assets, from the second row the mean of the top 5, and so on. In the example below, test_data holds the returns, top_n gives the number of assets to take from each row, and rank holds the within-row ranking used to select them. The result should be one mean per row: the mean return of the top n assets in that row.
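test_data <- matrix(rnorm(100), nrow = 10)
# rank 1 = highest return in the row; t() restores the row-by-asset layout
rank <- t(apply(-test_data, 1, rank))
# one entry per row of test_data (10 rows)
top_n <- c(3, 5, 9, 4, 8, 7, 6, 8, 3, 2)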
Upvotes: 2
Views: 131
Reputation: 887173
We can use mapply with asplit from base R:
mapply(function(dat, n) mean(tail(sort(dat), n)), asplit(test_data, 1), top_n)
#[1] 0.8813500 0.2114054 0.2584815 1.2650171 0.2365432 1.0525673
#[7] 0.9391072 0.1261873 0.8011962 1.6519498
set.seed(123)
test_data<- matrix(rnorm(100),nrow=10)
top_n<-c(3,5,9,4,8,7,6,8,3,2)
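The question describes selecting by rank, while the sort/tail approach selects by value. As a rough sketch, the two can be compared directly; rank_mat, mean_by_rank and mean_by_sort are names made up for this illustration, and the comparison assumes there are no tied returns within a row (which holds for rnorm data in practice).

```r
set.seed(123)
test_data <- matrix(rnorm(100), nrow = 10)
top_n <- c(3, 5, 9, 4, 8, 7, 6, 8, 3, 2)

# Rank 1 = highest return within each row; t() puts rows back as rows
rank_mat <- t(apply(-test_data, 1, rank))

# For each row, average the returns whose rank falls within the top n
mean_by_rank <- sapply(seq_len(nrow(test_data)), function(i) {
  mean(test_data[i, rank_mat[i, ] <= top_n[i]])
})

# Same selection via sort/tail, as in the mapply call
mean_by_sort <- mapply(function(dat, n) mean(tail(sort(dat), n)),
                       asplit(test_data, 1), top_n)

# Should be TRUE when no row contains ties
isTRUE(all.equal(mean_by_rank, unname(mean_by_sort)))
```

With ties, rank() returns averaged ranks by default, so the rank-based filter could pick more or fewer than n assets; the sort/tail version always takes exactly n.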
Upvotes: 1
Reputation: 388982
We can use sapply to loop over every row, select its respective top_n elements using tail on the sorted row, and take the mean of them.
sapply(seq_along(top_n), function(x) mean(tail(sort(test_data[x, ]), top_n[x])))
#[1] 0.881 0.211 0.258 1.265 0.237 1.053 0.939 0.126 0.801 1.652
data
set.seed(123)
test_data<- matrix(rnorm(100),nrow=10)
top_n<-c(3,5,9,4,8,7,6,8,3,2)
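Since the question asks for a function, the same idea could be wrapped up roughly as follows. This is only a sketch: top_n_mean and its arguments are hypothetical names, and the input checks are an assumption about what is useful here (for instance, catching a top_n vector whose length does not match the number of rows).

```r
# Hypothetical helper: mean of the n[i] largest values in row i of `returns`
top_n_mean <- function(returns, n) {
  stopifnot(is.matrix(returns),
            length(n) == nrow(returns),        # one n per row
            all(n >= 1), all(n <= ncol(returns)))
  vapply(seq_len(nrow(returns)), function(i) {
    # sort the row in decreasing order and keep the first n[i] values
    mean(sort(returns[i, ], decreasing = TRUE)[seq_len(n[i])])
  }, numeric(1))
}

# Small hand-checkable example
m <- matrix(c(1, 5, 3,
              4, 2, 6), nrow = 2, byrow = TRUE)
top_n_mean(m, c(2, 1))
# [1] 4 6
```

vapply is used instead of sapply only so that the result is guaranteed to be a numeric vector; the selection logic is the same sort-then-take-n idea.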
Upvotes: 3