simar

Reputation: 605

selecting the top n elements from a row and taking their mean

I have data that represents asset returns. I want to select the top N assets from each row and calculate the mean return of those selected assets. In detail, I want to write a function that selects a different number of elements from each row and takes their mean. For example, from the first row I want to select the top 3 elements based on their ranking and calculate their mean; from the second row I want the top 5 and their mean, and so on. The number of elements varies from row to row. I have tried to make an example. In the following example, test_data is the data the assets are taken from, top_n is the number of assets to take from each row, and rank is the ranking of the assets, on the basis of which the assets are selected. In the end I should get the mean return of the top n assets in each row.

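test_data <- matrix(rnorm(100), nrow = 10)
# rank 1 = highest return; t() keeps one row of ranks per row of data
rank <- t(apply(-test_data, 1, rank))
# one count per row (10 rows)
top_n <- c(3, 5, 9, 4, 8, 7, 6, 8, 3, 2)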

Upvotes: 2

Views: 131

Answers (2)

akrun

Reputation: 887173

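We can use mapply with asplit from base R. asplit(test_data, 1) splits the matrix into a list of rows, and mapply pairs each row with its entry in top_n. Sorting a row in ascending order and taking tail(…, n) yields its n largest values.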

mapply(function(dat, n) mean(tail(sort(dat), n)), asplit(test_data, 1), top_n)
#[1] 0.8813500 0.2114054 0.2584815 1.2650171 0.2365432 1.0525673
#[7] 0.9391072 0.1261873 0.8011962 1.6519498
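
The same idea can be wrapped in a small helper that checks its inputs first. This is only a sketch: the name top_n_row_means is hypothetical, and it assumes top_n has one entry per row and that the rows contain no NA values. Note that asplit requires R 3.6.0 or later.

```r
# Hypothetical helper: mean of the n largest values in each row,
# where n may differ from row to row.
top_n_row_means <- function(mat, top_n) {
  # One count per row, each between 1 and the number of columns
  stopifnot(length(top_n) == nrow(mat),
            all(top_n >= 1), all(top_n <= ncol(mat)))
  # Sort each row in descending order and average its first n values
  mapply(function(row, n) mean(head(sort(row, decreasing = TRUE), n)),
         asplit(mat, 1), top_n)
}

# Small hand-checkable example: top 2 of row 1, top 3 of row 2
m <- matrix(c(1, 5, 3, 4,
              8, 2, 6, 7), nrow = 2, byrow = TRUE)
top_n_row_means(m, c(2, 3))
#[1] 4.5 7.0
```

The up-front check would have caught a top_n vector whose length does not match the number of rows, which mapply would otherwise report less clearly.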

data

set.seed(123)
test_data<- matrix(rnorm(100),nrow=10)
top_n<-c(3,5,9,4,8,7,6,8,3,2)

Upvotes: 1

Ronak Shah

Reputation: 388982

We can use sapply to loop over every row, select its respective top_n elements using tail on the sorted row, and take their mean.

sapply(seq_along(top_n), function(x) mean(tail(sort(test_data[x, ]), top_n[x])))
#[1] 0.881 0.211 0.258 1.265 0.237 1.053 0.939 0.126 0.801 1.652
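
The question also mentions a rank matrix. If the selection should be driven by precomputed ranks rather than by sorting the returns directly, something along these lines may work. It is a sketch under assumptions: rank 1 marks the best asset, the rank matrix has the same shape as the data, and the names returns, ranks and rank_based_means are illustrative rather than taken from the original post.

```r
# Small hand-checkable data: 2 rows, 4 assets
returns <- matrix(c(1, 5, 3, 4,
                    8, 2, 6, 7), nrow = 2, byrow = TRUE)
top_n <- c(2, 3)

# Rank within each row, 1 = highest return. apply() over rows returns
# its results as columns, so t() restores one row of ranks per data row.
# ties.method = "first" ensures exactly n assets are picked per row.
ranks <- t(apply(-returns, 1, rank, ties.method = "first"))

# For each row, average the returns whose rank falls within the top n
rank_based_means <- sapply(seq_len(nrow(returns)), function(i) {
  mean(returns[i, ranks[i, ] <= top_n[i]])
})
rank_based_means
#[1] 4.5 7.0
```

Keeping the ranks separate from the returns is mainly useful when the ranking comes from a different criterion than the values being averaged; when both are the same matrix, the sort-based answers give the same result.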

data

set.seed(123)
test_data<- matrix(rnorm(100),nrow=10)
top_n<-c(3,5,9,4,8,7,6,8,3,2)

Upvotes: 3
