Reputation: 39
How would I go about finding the middle option, or "average", in the example below? I do not want to create new values by taking the mean across all columns, and taking the median does not work in this case either. I need to be able to determine that the blue line (col_5) is the "middle" one. Any tips? Thanks!
col_1 <- c(0,32,34,36,37,41,43,44,47,48,50)
col_2 <- c(0,3,4,5,6,7,9,14,16,18,20)
col_3 <- c(0,22,23,25,28,31,32,35,38,39,41)
col_4 <- c(0,1,2,3,5,6,8,9,11,13,15)
col_5 <- c(0,2,5,9,11,15,25,33,36,37,38)
df1 <- data.frame(col_1, col_2, col_3, col_4, col_5)
plot(df1$col_1, type ="l")
lines(df1$col_2)
lines(df1$col_3)
lines(df1$col_4)
lines(df1$col_5, col='blue')
Upvotes: 1
Views: 47
Reputation: 2722
You'll need to tweak how I arrive at the "middle" result, but from your question I take the problem to be:
For each column in the table, find the average, then determine which of those averages is the "middle" one, i.e. the median.
To accomplish this, I suggest iterating over the columns and calculating each average the good old-fashioned way, using sum(x) / length(x).
Essentially:
avgs <- sapply(df1, function(i) {
  # Average of one column: sum of its values divided by how many there are
  sum(i) / length(i)
})
> avgs
      col_1       col_2       col_3       col_4       col_5
37.45454545  9.27272727 28.54545455  6.63636364 19.18181818
# Just giving you a visual here
> sort(avgs)
      col_4       col_2       col_5       col_3       col_1
 6.63636364  9.27272727 19.18181818 28.54545455 37.45454545
# Pick the column whose average equals the median of the averages
> avgs[which(avgs == median(avgs))]
     col_5
19.1818182
# OR if you just need the name:
> names(which(avgs == median(avgs)))
[1] "col_5"
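One thing to watch for: the `avgs == median(avgs)` comparison only finds a match when the number of columns is odd. With an even number, `median()` returns the mean of the two middle averages, which usually matches no column. Below is a minimal sketch of one way around that, ranking the averages and taking the middle position. `middle_col` is a hypothetical helper name, not something from base R, and choosing the lower of the two middle columns in the even case is an arbitrary assumption you may want to change.

```r
# Data from the question
df1 <- data.frame(
  col_1 = c(0, 32, 34, 36, 37, 41, 43, 44, 47, 48, 50),
  col_2 = c(0, 3, 4, 5, 6, 7, 9, 14, 16, 18, 20),
  col_3 = c(0, 22, 23, 25, 28, 31, 32, 35, 38, 39, 41),
  col_4 = c(0, 1, 2, 3, 5, 6, 8, 9, 11, 13, 15),
  col_5 = c(0, 2, 5, 9, 11, 15, 25, 33, 36, 37, 38)
)

# Hypothetical helper: name of the column whose average sits in the middle
# when the column averages are ranked from smallest to largest.
# Assumption: with an even number of columns, take the lower middle one.
middle_col <- function(df) {
  avgs <- colMeans(df)                    # per-column averages
  ranked <- order(avgs)                   # column indices, smallest average first
  mid <- ranked[ceiling(length(avgs) / 2)]
  names(avgs)[mid]
}

middle_col(df1)          # "col_5"
middle_col(df1[, 1:4])   # "col_2" (lower of the two middle columns)
```

Because this works on positions rather than comparing floating-point values for equality, it always returns exactly one existing column.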
Upvotes: 1