Creating a new data frame with calculations from an existing data frame using R

Question

I am looking to create a new data frame of averages based off a population data frame. Column A is a list of ages followed by the population of each for a number of years. I am wondering what the best approach to creating a new data frame with the average age for each year. The formula I am using to calculate the average is sum(age*year)/sum(year)

For example this is what I have:

Age | 2000 | 2001 | 2002 
------------------------
2   | 4    | 1    | 2   
3   | 6    | 3    | 5  
4   | 10   | 9    | 8

and I would like a new data frame like this:

Year| Avg_age 
--------------
2000 | 3.3    
2001 | 3.6
2002 | 3.4

akrun · Accepted Answer

We can reshape to 'long' format and get the summarised output grouped by 'Year'

library(dplyr)
library(tidyr)
df1 %>%
   pivot_longer(cols = -Age, names_to = 'Year') %>%
   group_by(Year) %>%
   summarise(Avg_age = sum(Age * value)/sum(value), .groups = 'drop')

-output

# A tibble: 3 x 2
#  Year  Avg_age
#*    
#1 2000     3.3 
#2 2001     3.62
#3 2002     3.4

Or it can be done in reverse as well i.e. first get the summarised output and then do the reshaping

df1 %>%
   summarise(across(-Age, ~  sum(Age * .)/sum(.))) %>% 
   pivot_longer(cols = everything(), names_to = 'Year', 
        values_to = 'Avg_age')
# A tibble: 3 x 2
#  Year  Avg_age
#     
#1 2000     3.3 
#2 2001     3.62
#3 2002     3.4

Or using dapply with stack

library(collapse)
stack(dapply(df1[-1], function(x) sum(x * df1$Age)/sum(x)))[2:1]

data

df1 <- structure(list(Age = 2:4, `2000` = c(4, 6, 10), `2001` = c(1, 
3, 9), `2002` = c(2, 5, 8)), row.names = c(NA, -3L), class = "data.frame")

Creating a new data frame with calculations from an existing data frame using R

Answers (1)

data

Related Questions