Eric Nilsen

Reputation: 101

Create a function for making an index out of different indicators across different datasets

I'm working with the European Social Survey, and have a separate dataframe for each country. All of these dataframes have the same structure; only the values of each variable differ. What I would like to do is create a new variable in each dataset that is equal to the sum of several other variables. Is there a way to create a function that does this for every dataframe?

What I have done before is simply create a new column with: Data$new <- Data$old1 + Data$old2...etc. However, when working with several variables over several datasets this seems rather inefficient, and I'm quite sure that there must be an easier way. I just don't know what to google.

Example:

I have two dataframes, A and B:

A1 <- c(1,2,3,4,5)
A2 <- c(6,7,8,9,10)
A <- data.frame(A1, A2)
B1 <- c(10,12,13,15,24)
B2 <- c(23,24,25,45,65)
B <- data.frame(B1, B2)

What I want to do is, for each dataframe, create a new column which is equal to the sum of the other two. Usually I would do it like this:

A$A3 <- A$A1 + A$A2
B$B3 <- B$B1 + B$B2

However, doing this across several dataframes with a large number of variables seems like an inefficient way of doing it. Since the names of the variables are the same across the dataframes, is there a way to make a function that looks for said variables and creates the new one in a better way?

Upvotes: 1

Views: 159

Answers (2)

akrun

Reputation: 887911

An option with map (from purrr) and dplyr, which sums all columns of each dataframe row-wise:

library(tidyverse)
map(mget(c("A", "B")),  ~ .x %>% 
                            mutate(Total = reduce(., `+`)))
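Note that reduce(., `+`) adds up every column, so it would fail if a dataframe also holds character or factor columns, as survey data often does. A minimal sketch of one way to guard against that, assuming dplyr 1.0 or later so that across() and where() are available (this is a variation, not the only way to do it):

```r
library(dplyr)
library(purrr)

# The example data from the question.
A <- data.frame(A1 = c(1, 2, 3, 4, 5), A2 = c(6, 7, 8, 9, 10))
B <- data.frame(B1 = c(10, 12, 13, 15, 24), B2 = c(23, 24, 25, 45, 65))

# For each dataframe, sum only the numeric columns row-wise.
# where(is.numeric) skips any non-numeric columns that may be present.
out <- map(
  list(A = A, B = B),
  ~ mutate(.x, Total = rowSums(across(where(is.numeric))))
)

out$A
out$B
```

The result is a named list, so each country's dataframe can be pulled out again with out$A, out$B and so on.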

Upvotes: 1

NelsonGon

Reputation: 13319

We can create a helper auto_add:

auto_add <- function(df, col_a, col_b){
  df$total <- rowSums(df[c(col_a,col_b)])
  df
}
auto_add(A,"A1","A2")

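For many data sets, if the target columns are known, we could generalize the helper to take a vector of columns:

auto_add <- function(df, target_cols) {
  # target_cols may be column positions or column names
  df$total <- rowSums(df[target_cols])
  df
}
lapply(list(A, B), auto_add, target_cols = 1:2)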

Result:

[[1]]
  A1 A2 total
1  1  6     7
2  2  7     9
3  3  8    11
4  4  9    13
5  5 10    15

[[2]]
  B1 B2 total
1 10 23    33
2 12 24    36
3 13 25    38
4 15 45    60
5 24 65    89
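Since the question says the real survey dataframes share variable names, selecting by name may be safer than selecting by position. A minimal sketch, where the dataframes norway and sweden, the columns trust1 and trust2, and the helper add_index are all hypothetical names made up for illustration:

```r
# Hypothetical per-country dataframes that share column names
# (names and values are illustrative, not taken from the ESS).
norway <- data.frame(trust1 = c(1, 2, 3), trust2 = c(4, 5, 6))
sweden <- data.frame(trust1 = c(7, 8, 9), trust2 = c(1, 1, 1))

# add_index is a hypothetical helper: it sums the named columns row-wise
# and stores the result in a new column called new_col.
add_index <- function(df, cols, new_col = "index") {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    # Fail early if a dataframe lacks one of the expected variables.
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  df[[new_col]] <- rowSums(df[cols])
  df
}

# A named list keeps the country labels attached to the results.
countries <- list(norway = norway, sweden = sweden)
countries <- lapply(countries, add_index, cols = c("trust1", "trust2"))

countries$norway
countries$sweden
```

If the indicators contain missing values, rowSums() also accepts na.rm = TRUE, though whether to ignore missing items when building an index is a substantive choice.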

Upvotes: 1
