neòinean

Reputation: 39

Taking the average without creating new values

How would I go about finding the middle option or "average" in the example below? I do not want to create new values by taking the mean of all columns, and taking the median also does not work in this case. I need to be able to identify the blue line (col_5) as the "middle" one. Any tips? Thanks!

col_1 <- c(0,32,34,36,37,41,43,44,47,48,50)
col_2 <- c(0,3,4,5,6,7,9,14,16,18,20)
col_3 <- c(0,22,23,25,28,31,32,35,38,39,41)
col_4 <- c(0,1,2,3,5,6,8,9,11,13,15)
col_5 <- c(0,2,5,9,11,15,25,33,36,37,38)


df1 <- data.frame(col_1, col_2, col_3, col_4, col_5)

plot(df1$col_1, type ="l")
lines(df1$col_2)
lines(df1$col_3)
lines(df1$col_4)
lines(df1$col_5, col='blue')

[Image: line plot of the five columns, with col_5 drawn in blue]

Upvotes: 1

Views: 47

Answers (1)

Carl Boneri

Reputation: 2722

You may need to tweak how I arrive at the "middle" result, but from your question I take your problem to be:

For all columns in a table, find the average, then determine which of those averages is the 'middle' one, i.e. the median.

To do this, I suggest iterating over the columns and computing each average the old-fashioned way, essentially sum(x) / length(x):

avgs <- sapply(df1, function(i){
    # mean of one column: its sum divided by its number of values
    sum(i) / length(i)
})

> avgs
      col_1       col_2       col_3       col_4       col_5 
37.45454545  9.27272727 28.54545455  6.63636364 19.18181818 

# Just giving you a visual here
> sort(avgs)
      col_4       col_2       col_5       col_3       col_1 
 6.63636364  9.27272727 19.18181818 28.54545455 37.45454545 

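Now we just need to find which average is the middle one. With an odd number of columns, as here, the median is itself one of the averages, so an exact comparison finds it: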

> avgs[which(avgs == median(avgs))]
     col_5 
19.1818182 

# OR if you just need the name:

> names(which(avgs == median(avgs)))
[1] "col_5"
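
One caveat with the exact `==` comparison above: with an even number of columns, median() returns the mean of the two central averages, which usually matches none of them, so the lookup comes back empty. A minimal sketch of a variant that picks the column whose average is closest to the median instead (the name `middle_col` is just illustrative, and "closest to the median" is my assumption about what "middle" should mean in that case):

```r
# Same data as in the question
df1 <- data.frame(
  col_1 = c(0, 32, 34, 36, 37, 41, 43, 44, 47, 48, 50),
  col_2 = c(0, 3, 4, 5, 6, 7, 9, 14, 16, 18, 20),
  col_3 = c(0, 22, 23, 25, 28, 31, 32, 35, 38, 39, 41),
  col_4 = c(0, 1, 2, 3, 5, 6, 8, 9, 11, 13, 15),
  col_5 = c(0, 2, 5, 9, 11, 15, 25, 33, 36, 37, 38)
)

# colMeans() computes the per-column averages in one call
avgs <- colMeans(df1)

# Pick the column whose average lies closest to the median of the averages.
# Unlike an exact == match, this always returns one column; if two columns
# are equally close, which.min() keeps the first of them.
middle_col <- names(avgs)[which.min(abs(avgs - median(avgs)))]
middle_col
# [1] "col_5"
```

No new values are created this way: the result is always the name of an existing column, so `df1[[middle_col]]` gives you the original series.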

Upvotes: 1
