Reputation: 17
Let's say I make a dummy dataframe with 6 columns with 10 observations:
X <- data.frame(a=1:10, b=11:20, c=21:30, d=31:40, e=41:50, f=51:60)
I need to create a loop that evaluates 3 columns at a time, adding the summed second and third columns and dividing this by the sum of the first column:
(sum(b)+sum(c))/sum(a) ... (sum(e)+sum(f))/sum(d) ...
I then need to construct a final dataframe from these values. For example using the dummy dataframe above, it would look like:
value
1. 7.454545
2. 2.84507
I imagine I need to use the next function to iterate within the loop, but I'm fairly lost! Thank you for any help.
Upvotes: 0
Views: 106
Reputation: 887971
Here is an option with tidyverse
library(dplyr) # 1.0.0
library(tidyr)
X %>%
summarise(across(.fn = sum)) %>%
pivot_longer(everything()) %>%
group_by(grp = as.integer(gl(n(), 3, n()))) %>%
summarise(value = sum(lead(value)/first(value), na.rm = TRUE)) %>%
select(value)
# A tibble: 2 x 1
# value
# <dbl>
#1 7.45
#2 2.85
Upvotes: 0
Reputation: 79348
You could use tapply
:
tapply(colSums(X), gl(ncol(X)/3, 3), function(x)sum(x[-1])/x[1])
1 2
7.454545 2.845070
Upvotes: 1
Reputation: 28705
You can split
your data frame into groups of 3 by creating a vector with rep
where each element repeats 3 times. Then with this list of sub data frames, (s)apply the function of summing the second and third columns, adding them, and dividing by the sum of the first column.
out_vec <-
sapply(
split.default(X, rep(1:ncol(X), each = 3, length.out = ncol(X)))
, function(x) (sum(x[2]) + sum(x[3]))/sum(x[1]))
data.frame(value = out_vec)
# value
# 1 7.454545
# 2 2.845070
You could also sum all the columns up front before the sapply with colSums
, which will be more efficient.
out_vec <-
sapply(
split(colSums(X), rep(1:ncol(X), each = 3, length.out = ncol(X)))
, function(x) (x[2] + x[3])/x[1])
data.frame(value = out_vec, row.names = NULL)
# value
# 1 7.454545
# 2 2.845070
Upvotes: 1