wammy

Reputation: 17

Create a loop for calculating values from a dataframe in R?

Let's say I make a dummy dataframe with 6 columns and 10 observations:

X <- data.frame(a=1:10, b=11:20, c=21:30, d=31:40, e=41:50, f=51:60)

I need to create a loop that evaluates 3 columns at a time, adding the summed second and third columns and dividing this by the sum of the first column:

 (sum(b)+sum(c))/sum(a) ... (sum(e)+sum(f))/sum(d) ...

I then need to construct a final dataframe from these values. For example, using the dummy dataframe above, it would look like:

        value
1.     7.454545
2.     2.84507

I imagine I need to use the next function to iterate within the loop, but I'm fairly lost! Thank you for any help.

Upvotes: 0

Views: 106

Answers (3)

akrun

Reputation: 887971

Here is an option with the tidyverse:

library(dplyr) # 1.0.0
library(tidyr)
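X %>% 
     # sum every column (one-row result)
     summarise(across(everything(), sum)) %>% 
     # long format: one row per column sum
     pivot_longer(everything()) %>% 
     # label consecutive runs of 3 rows as one group
     group_by(grp = as.integer(gl(n(), 3, n()))) %>% 
     # lead(value) is (second, third, NA), so the sum is (second + third) / first
     summarise(value = sum(lead(value)/first(value), na.rm = TRUE)) %>% 
     select(value)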
# A tibble: 2 x 1
#  value
#  <dbl>
#1  7.45
#2  2.85

Upvotes: 0

Onyambu

Reputation: 79348

You could use tapply:

tapply(colSums(X), gl(ncol(X)/3, 3), function(x)sum(x[-1])/x[1])
       1        2 
7.454545 2.845070 
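
To see how the one-liner above fits together, here is a step-by-step sketch of the same idea. The names totals, groups, res and final are only illustrative, and it assumes the number of columns is a multiple of 3.

```r
X <- data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40, e = 41:50, f = 51:60)

totals <- colSums(X)          # named vector with one sum per column
groups <- gl(ncol(X) / 3, 3)  # factor 1 1 1 2 2 2: one label per column

# Within each group, x[1] is the first column's sum and x[-1] the other two
res <- tapply(totals, groups, function(x) sum(x[-1]) / x[1])

# as.vector() drops the group names before building the final data frame
final <- data.frame(value = as.vector(res))
```

Here final has a single value column with one row per group of three columns, which is the shape asked for in the question.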

Upvotes: 1

IceCreamToucan

Reputation: 28705

You can split your data frame into groups of 3 columns by building a grouping vector with rep in which each group number repeats 3 times. split.default then returns a list of sub data frames, and sapply applies a function to each one that adds the sums of the second and third columns and divides the result by the sum of the first column.

out_vec <- 
  sapply(
    split.default(X, rep(1:ncol(X), each = 3, length.out = ncol(X)))
    , function(x) (sum(x[2]) + sum(x[3]))/sum(x[1]))

data.frame(value = out_vec)
#      value
# 1 7.454545
# 2 2.845070

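You could also sum all the columns up front with colSums before calling sapply, which will be more efficient. Because the result keeps element names, row.names = NULL is passed to data.frame to get plain 1, 2 row labels.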

out_vec <- 
  sapply(
    split(colSums(X), rep(1:ncol(X), each = 3, length.out = ncol(X)))
    , function(x) (x[2] + x[3])/x[1])

data.frame(value = out_vec, row.names = NULL)
#      value
# 1 7.454545
# 2 2.845070
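
Since the question asks specifically for a loop, the same calculation can also be written as an explicit for loop. This is only a minimal sketch: starts, value and result are illustrative names, and it assumes the number of columns is a multiple of 3. Note that next is not needed here; it only skips to the following iteration of a loop.

```r
X <- data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40, e = 41:50, f = 51:60)

# Assumption: columns come in complete groups of three
stopifnot(ncol(X) %% 3 == 0)

starts <- seq(1, ncol(X), by = 3)  # index of the first column in each group
value <- numeric(length(starts))   # preallocate one slot per group

for (i in seq_along(starts)) {
  j <- starts[i]
  # (sum of second column + sum of third column) / sum of first column
  value[i] <- (sum(X[[j + 1]]) + sum(X[[j + 2]])) / sum(X[[j]])
}

result <- data.frame(value = value)
```

Stepping the index by 3 plays the role of the grouping vector in the sapply versions above, and preallocating value avoids growing a vector inside the loop.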

Upvotes: 1
