anderwyang

Reputation: 2441

How to compute row sums of a data frame over column groups without errors from missing columns

Given the data frame ori_df below, how can I compute row sums over groups of columns without getting an error, ideally in a concise way? See question 1 and question 2 below.

library(dplyr)  # provides %>%, mutate(), rowwise(), c_across()

ori_df <- data.frame(values = 1:10) %>% t() %>% as.data.frame()
colnames(ori_df) <- LETTERS[1:10]

map_list <- list('group_a' = c('A','D','E'),'group_b' = c('G','H','Z'))

Question 1: There is no variable Z in ori_df. How can I avoid the error so that group_b equals G + H? (How should the code below be fixed?)

group_df <- ori_df %>% mutate(group_a = A + D + E,
                              group_b = G + H + Z)

Question 2: How can I create the variables group_a and group_b from map_list while avoiding the error caused by Z not being a column of ori_df? (How should the code below be fixed?)

group_df <- ori_df %>% rowwise() %>% mutate(sum=sum(c_across(map_list)))

Upvotes: 2

Views: 91

Answers (4)

s_baldur

Reputation: 33603

I would use intersect():

foo_task1 <- function(df, group_list) {
  df[names(group_list)] <- lapply(
    group_list,
    \(lst) rowSums(df[intersect(names(df), lst)])
  )
  df
}

foo_task1(ori_df, map_list)
#        A B C D E F G H I  J group_a group_b
# values 1 2 3 4 5 6 7 8 9 10      10      15
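One possible refinement, since intersect() drops unknown names silently: a misspelled column name would go unnoticed. The sketch below is a hypothetical variant (the name sum_groups_warn is not from the answer above) that does the same intersect() plus rowSums() computation but warns about every name it skips.

```r
# Rebuild the question's data in base R so the example is self-contained
ori_df <- as.data.frame(t(data.frame(values = 1:10)))
colnames(ori_df) <- LETTERS[1:10]
map_list <- list(group_a = c("A", "D", "E"), group_b = c("G", "H", "Z"))

# Hypothetical helper: sums each group's existing columns row by row
# and warns about names that are not columns of df
sum_groups_warn <- function(df, group_list) {
  src <- df  # look up columns in the original data only
  for (nm in names(group_list)) {
    cols <- group_list[[nm]]
    missing_cols <- setdiff(cols, names(src))
    if (length(missing_cols) > 0) {
      warning(sprintf("%s: skipping missing column(s): %s",
                      nm, paste(missing_cols, collapse = ", ")))
    }
    df[[nm]] <- rowSums(src[intersect(cols, names(src))])
  }
  df
}

res <- sum_groups_warn(ori_df, map_list)  # warns that Z is skipped for group_b
```

Whether a warning is preferable to silent skipping depends on how much map_list is trusted; for hand-written mappings the warning tends to catch typos early.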

Upvotes: 2

Edward

Reputation: 19339

For the first question, you can use a function that checks which of its arguments exist as columns of ori_df and drops the missing ones before summing.

add <- function(...){
  mc <- match.call(expand.dots = FALSE)$`...`
  e <- sapply(mc, exists, where=ori_df)
  sum(sapply(mc[e], eval, envir=ori_df), na.rm=TRUE)
}

And then modify the code slightly:

ori_df %>% mutate(group_a = add(A, D, E),
                  group_b = add(G, H, Z))

       A B C D E F G H I  J group_a group_b
values 1 2 3 4 5 6 7 8 9 10      10      15

For the second question, use any_of:

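library(dplyr)

# any_of() silently skips names that are not columns, so the missing Z is ignored.
# Note: without rowwise(), sum() adds up every value in the selected columns,
# which equals the row sum here only because ori_df has a single row.
mutate(ori_df, 
       group_a = sum(c_across(any_of(map_list$group_a))),
       group_b = sum(c_across(any_of(map_list$group_b))))

       A B C D E F G H I  J group_a group_b
values 1 2 3 4 5 6 7 8 9 10      10      15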
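The any_of() idea can also be driven directly by map_list, so the group names do not have to be typed out by hand. This is only a minimal sketch, assuming dplyr is installed; it loops over the list and uses rowSums() on the selected columns, which also gives per-row sums if the data frame has more than one row.

```r
library(dplyr)

# Rebuild the question's data so the example is self-contained
ori_df <- as.data.frame(t(data.frame(values = 1:10)))
colnames(ori_df) <- LETTERS[1:10]
map_list <- list(group_a = c("A", "D", "E"), group_b = c("G", "H", "Z"))

group_df <- ori_df
for (nm in names(map_list)) {
  # any_of() keeps only the names that exist, so Z is skipped without an error
  group_df[[nm]] <- rowSums(select(ori_df, any_of(map_list[[nm]])))
}
```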

Upvotes: 0

benson23

Reputation: 19107

I'm not sure dplyr is the best way to go. With base R, you can iterate over map_list to select the matching columns and bind the results to the original ori_df.

cbind(ori_df, lapply(map_list, \(x) sum(ori_df[, colnames(ori_df) %in% x])))

  A B C D E F G H I  J group_a group_b
1 1 2 3 4 5 6 7 8 9 10      10      15
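Note that sum() collapses all selected values into one number, which matches the row sum here because ori_df has a single row. For data with several rows, a sketch along the same lines could swap in rowSums(); the two-row data frame df2 below is made up for illustration, and drop = FALSE keeps the selection a data frame even when only one column matches.

```r
# Hypothetical two-row data frame with the same column names A..J
df2 <- as.data.frame(matrix(1:20, nrow = 2, byrow = TRUE))
colnames(df2) <- LETTERS[1:10]
map_list <- list(group_a = c("A", "D", "E"), group_b = c("G", "H", "Z"))

# One vector of per-row sums per group; names missing from df2 never match %in%
group_sums <- lapply(
  map_list,
  function(x) rowSums(df2[, colnames(df2) %in% x, drop = FALSE])
)
res <- cbind(df2, as.data.frame(group_sums))
```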

Upvotes: 3

ThomasIsCoding

Reputation: 102529

You can try

cbind(
    ori_df,
    lapply(
        map_list,
        \(x) sum(t(ori_df)[match(x, names(ori_df)), ], na.rm = TRUE)
    )
)

which gives

       A B C D E F G H I  J group_a group_b
values 1 2 3 4 5 6 7 8 9 10      10      15

Upvotes: 3
