KMISH

Reputation: 49

How to take the mean of specific columns

I have 208 columns, with each sample measured in replicate (104 samples x 2 replicates). I want to take the mean of each pair of replicates using a loop in R. Can anyone suggest how?

w x y a b e
5 1 1 2 4 1
6 2 2 5 3 6
7 3 3 8 9 3
8 4 6 9 1 3

For example, given columns w, x, y, a, b, e, I want to take the average of w and x, of y and a, and of b and e, and store the averages in another data frame with columns named w_x, y_a, b_e.

Upvotes: 0

Views: 230

Answers (3)

acylam

Reputation: 18661

You can also do something like this with dplyr + tidyr:

library(dplyr)
library(tidyr)

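cols = colnames(df)
odd = seq_along(cols) %% 2 == 1  # TRUE for the 1st, 3rd, 5th, ... column

data.frame(t(df)) %>%
  # label each pair of rows (the original columns) as "first_second"
  mutate(ID = rep(paste(cols[odd], cols[!odd], sep = "_"), each = 2)) %>%
  group_by(ID) %>%
  summarize_all(mean) %>%
  # reshape back so that each pair becomes one column
  gather(variable, value, -ID) %>%
  spread(ID, value) %>%
  select(-variable)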

Result:

# A tibble: 4 x 3
    b_e   w_x   y_a
* <dbl> <dbl> <dbl>
1   2.5     3   1.5
2   4.5     4   3.5
3   6.0     5   5.5
4   2.0     6   7.5

Data:

df = read.table(text = "w x y a b e
                 5 1 1 2 4 1
                 6 2 2 5 3 6
                 7 3 3 8 9 3
                 8 4 6 9 1 3", header = TRUE)
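
The densest step in the pipeline is the ID column, which pairs each odd-positioned column name with the even-positioned name that follows it. As a minimal sketch of just that step (the names `odd` and `labels` are illustrative, not part of the pipeline, and an even number of columns is assumed):

```r
cols <- c("w", "x", "y", "a", "b", "e")

# TRUE for positions 1, 3, 5, ...; FALSE for positions 2, 4, 6, ...
odd <- seq_along(cols) %% 2 == 1

# Paste each odd-positioned name to the even-positioned name after it
labels <- paste(cols[odd], cols[!odd], sep = "_")

# Repeat each label twice so it lines up with the rows of t(df)
id <- rep(labels, each = 2)
print(labels)
print(id)
```

With an odd number of columns the two name vectors would differ in length and `paste` would silently recycle, so checking `length(cols) %% 2 == 0` first is probably worthwhile.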

Upvotes: 1

nghauran

Reputation: 6768

Here is a detailed example using a loop.

df <- data.frame(w = c(5, 6, 7, 8),
                 x = c(1, 2, 3, 4),
                 y = c(1, 2, 3, 6),
                 a = c(2, 5, 8, 9),
                 b = c(4, 3, 9, 1),
                 e = c(1, 6, 3, 3))
str(df)
# index of columns on which we will iterate
vect <- seq_len(ncol(df))[seq_len(ncol(df)) %% 2 != 0]
# Extract data frame columns every two columns
# initialize lists
new.lst <- list() # list of dataframes of two consecutive columns
ave.list <- list() # list of averages
for(i in seq_along(vect)){
        new.lst[[i]] <- df[, seq(from = vect[i], to = (vect[i] + 1))]
        ave.list[[i]] <- rowMeans(new.lst[[i]], na.rm = TRUE)
        names(ave.list)[i] <- paste(colnames(new.lst[[i]])[1],
                                    colnames(new.lst[[i]])[2],
                                    sep = "_") # set the names of columns
}
new.lst # list of dataframes of two consecutive columns - complete
ave.list # list of averages - complete
# final dataframe
df2 <- as.data.frame.list(ave.list)
df2
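
Once the loop is clear, the same idea can be condensed. The following is only a sketch, assuming an even number of columns with replicates in adjacent positions; `starts` and `pair_means` are names chosen here for illustration:

```r
df <- data.frame(w = c(5, 6, 7, 8), x = c(1, 2, 3, 4), y = c(1, 2, 3, 6),
                 a = c(2, 5, 8, 9), b = c(4, 3, 9, 1), e = c(1, 6, 3, 3))

# Replicates are assumed to sit in adjacent columns: (1,2), (3,4), ...
stopifnot(ncol(df) %% 2 == 0)
starts <- seq(1, ncol(df), by = 2)

# Row-wise mean of each pair of adjacent columns
pair_means <- lapply(starts, function(i) rowMeans(df[, c(i, i + 1)], na.rm = TRUE))
names(pair_means) <- paste(names(df)[starts], names(df)[starts + 1], sep = "_")

df2 <- as.data.frame(pair_means)
df2
```

Because the pairs are taken by position, the same code should work unchanged on the full 208-column data, provided each sample's two replicates really are next to each other.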

Upvotes: 0

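mtcarsd <- mtcars[1:6]

To select the first column of each pair, index with the logical vector c(TRUE, FALSE), which R recycles across all columns; c(FALSE, TRUE) selects the second column of each pair.

first_cols <- mtcarsd[c(TRUE, FALSE)]

sec_cols <- mtcarsd[c(FALSE, TRUE)]

Add the two halves element-wise and divide by 2 to get the row-wise mean of each pair, then name the result after both columns:

fs <- (first_cols + sec_cols) / 2

names(fs) <- paste(names(first_cols), names(sec_cols), sep = "_")

fs is now a data frame of pair averages. If you also need a single overall mean for each pair, use sapply:

sapply(fs, mean)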
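
Applied to the sample data from the question, the column-splitting idea might look like the sketch below. Dividing the element-wise sum by 2 gives the per-row mean of each pair; `pair_means` is an illustrative name, and an even number of columns is assumed.

```r
df <- read.table(text = "w x y a b e
5 1 1 2 4 1
6 2 2 5 3 6
7 3 3 8 9 3
8 4 6 9 1 3", header = TRUE)

first_cols <- df[c(TRUE, FALSE)]  # w, y, b
sec_cols   <- df[c(FALSE, TRUE)]  # x, a, e

# Element-wise sum of the two halves, halved, is the mean of each pair
pair_means <- (first_cols + sec_cols) / 2
names(pair_means) <- paste(names(first_cols), names(sec_cols), sep = "_")
pair_means
```

Note that if one replicate is NA, the result for that cell is NA as well; rowMeans(..., na.rm = TRUE) would instead fall back to the remaining value.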

Upvotes: 0
