Reputation: 189
Say my dataset has two columns
A = c(1,5,NA,NA,NA)
B = c(NA,NA,10,11,13)
how do I create a new column C that takes the max value from A and the min from B to create a new column:
C= c(NA,5,10,NA,NA)
Upvotes: 2
Views: 195
Reputation: 79328
In base R you could also do:
replace(rep(NA, length(A)),
c(i1<-which.min(B), i2<-which.max(A)), c(B[i1], A[i2]))
[1] NA 5 10 NA NA
We could break this down as:
C <- rep(NA, length(A))
i1 <- which.min(B)
i2 <- which.max(A)
C[c(i1,i2)] <- c(B[i1], A[i2])
C
[1] NA 5 10 NA NA
Upvotes: 1
Reputation: 160687
In this case, if which.max(A) == which.min(B)
, then A
wins ... that can be changed by swapping the order within case_when
or reversing the nesting of the base R ifelse
s.
library(dplyr)
quux %>%
mutate(
C2 = case_when(
row_number() == which.max(A) ~ A,
row_number() == which.min(B) ~ B,
TRUE ~ NA)
)
# A B C C2
# 1 1 NA NA NA
# 2 5 NA 5 5
# 3 NA 10 10 10
# 4 NA 11 NA NA
# 5 NA 13 NA NA
(I don't really like nested ifelse
, but this is uncomplicated ...)
with(quux, ifelse(seq_along(A) == which.max(A), A,
ifelse(seq_along(A) == which.min(B), B, NA)))
# [1] NA 5 10 NA NA
Data
quux <- structure(list(A = c(1, 5, NA, NA, NA), B = c(NA, NA, 10, 11, 13), C = c(NA, 5, 10, NA, NA)), class = "data.frame", row.names = c(NA, -5L))
(P.S.: if you know all numbers will be effectively integer, then you can shorten this a little by replacing seq_along(A) == which.max(A)
with A == max(A)
, etc. The reason I don't start with that is that floating-point equality is imperfect in the presence of high-precision numbers, see Why are these numbers not equal? and https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f.)
Upvotes: 4