Conditional sum data in R

Question

I have an R data frame with many columns, and I want to sum only those columns (header: score) whose value in the row named "Matt" is either > 25 or < -25. The sum can be placed after the last column.

input (df1)

Name	score	score	score	score	score	score	score
Alex	31	15	18	22	23	23	23
Pat	37	18	29	15	28	28	-28
Matt	33	27	18	88	9	-19	-29
James	12	-36	32	13	21	21	21

output (df2)

Name	score	score	score	score	score	score	score	sum
Alex	31	15	18	22	23	23	23	91
Pat	37	18	29	15	28	28	-28	42
Matt	33	27	18	88	9	-19	-29	119
James	12	-36	32	13	21	21	21	10

Any thoughts are more than welcome,

Regards,

akrun · Accepted Answer

We create a logical vector ('i1') that flags the row where 'Name' is "Matt". We then combine the two comparisons on that row (> 25 and < -25) with OR (|) to get a logical index for the columns ('i2'). Finally, we subset the score columns with 'i2', take the rowSums of those columns, and assign the result to a new 'sum' column.

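# Flag the row for "Matt"
i1 <- df1$Name == "Matt"

# Columns (excluding 'Name') where Matt's value is > 25 or < -25
i2 <- df1[i1, -1] > 25 | df1[i1, -1] < -25

# Sum only those columns for every row
df1$sum <- rowSums(df1[-1][, i2], na.rm = TRUE)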

Or using dplyr

library(dplyr)
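# Keep numeric columns where Matt's value is > 25 or < -25, then sum them row-wise.
# Note: cur_data() is deprecated as of dplyr 1.1.0 in favour of pick().
df1 %>%
  mutate(Matt = rowSums(select(cur_data(),
    where(~ is.numeric(.) &&
      (.[Name == 'Matt'] > 25 | .[Name == 'Matt'] < -25)))))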

-output

#    Name score score.1 score.2 score.3 score.4 score.5 score.6 Matt
#1  Alex    31      15      18      22      23      23      23   91
#2   Pat    37      18      29      15      28      28     -28   42
#3  Matt    33      27      18      88       9     -19     -29  119
#4 James    12     -36      32      13      21      21      21   10
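
For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch in base R. It is a variant of the approach above, not the exact code from the answer: the `data.frame()` call rebuilds the question's table by hand (with the `score.1` ... `score.6` names that R assigns to duplicated headers), and the helper names `matt` and `keep` are my own. It also assumes exactly one row is named "Matt", and uses `abs(x) > 25` as shorthand for `x > 25 | x < -25`.

```r
# Rebuild the question's data by hand (column names made unique, as R does on import).
df1 <- data.frame(
  Name    = c("Alex", "Pat", "Matt", "James"),
  score   = c(31, 37, 33, 12),
  score.1 = c(15, 18, 27, -36),
  score.2 = c(18, 29, 18, 32),
  score.3 = c(22, 15, 88, 13),
  score.4 = c(23, 28, 9, 21),
  score.5 = c(23, 28, -19, 21),
  score.6 = c(23, -28, -29, 21)
)

# Matt's scores as a plain named numeric vector (assumes a single "Matt" row).
matt <- unlist(df1[df1$Name == "Matt", -1])

# TRUE for columns where Matt's value is > 25 or < -25.
keep <- abs(matt) > 25

# Select the qualifying columns by name and sum them row by row.
df1$sum <- rowSums(df1[names(matt)[keep]], na.rm = TRUE)

print(df1)
```

Selecting with `df1[cols]` rather than `df1[, cols]` always returns a data frame, so `rowSums()` still works if only one column happens to qualify.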
