hks
hks

Reputation: 189

mutate to create new columns with max and mins from other columns by group

Say my dataset has two columns

A = c(1,5,NA,NA,NA)
B = c(NA,NA,10,11,13)

how do I create a new column C that takes the max value from A and the min from B to create a new column:

C= c(NA,5,10,NA,NA)

Upvotes: 2

Views: 195

Answers (2)

Onyambu
Onyambu

Reputation: 79328

In base R you could also do:

replace(rep(NA, length(A)), 
         c(i1<-which.min(B), i2<-which.max(A)), c(B[i1], A[i2]))
[1] NA  5 10 NA NA

We could break this down as:

C <- rep(NA, length(A))
i1 <- which.min(B)
i2 <- which.max(A)
C[c(i1,i2)] <- c(B[i1], A[i2])
C
[1] NA  5 10 NA NA

Upvotes: 1

r2evans
r2evans

Reputation: 160687

In this case, if which.max(A) == which.min(B), then A wins ... that can be changed by swapping the order within case_when or reversing the nesting of the base R ifelses.

dplyr

library(dplyr)
quux %>%
  mutate(
    C2 = case_when(
      row_number() == which.max(A) ~ A, 
      row_number() == which.min(B) ~ B, 
      TRUE ~ NA)
  )
#    A  B  C C2
# 1  1 NA NA NA
# 2  5 NA  5  5
# 3 NA 10 10 10
# 4 NA 11 NA NA
# 5 NA 13 NA NA

base R

(I don't really like nested ifelse, but this is uncomplicated ...)

with(quux, ifelse(seq_along(A) == which.max(A), A,
                  ifelse(seq_along(A) == which.min(B), B, NA)))
# [1] NA  5 10 NA NA

Data

quux <- structure(list(A = c(1, 5, NA, NA, NA), B = c(NA, NA, 10, 11, 13), C = c(NA, 5, 10, NA, NA)), class = "data.frame", row.names = c(NA, -5L))

(P.S.: if you know all numbers will be effectively integer, then you can shorten this a little by replacing seq_along(A) == which.max(A) with A == max(A), etc. The reason I don't start with that is that floating-point equality is imperfect in the presence of high-precision numbers, see Why are these numbers not equal? and https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f.)

Upvotes: 4

Related Questions