Reputation: 26238
Let's assume I have a dataset df wherein I want to match value of col X
across values of multiple columns A to F
and want to return the matching column name (else NA) in a new column.
dput
df <- structure(list(A = c(4L, NA, NA, NA), B = c(NA, 5L, NA, NA),
C = c(NA, NA, NA, NA), D = c(NA, 4L, 6L, 7L), E = c(5L, NA,
NA, NA), F = c(NA, NA, NA, NA), X = 4:7), class = "data.frame", row.names = c(NA,
-4L))
> df
A B C D E F X
1 4 NA NA NA 5 NA 4
2 NA 5 NA 4 NA NA 5
3 NA NA NA 6 NA NA 6
4 NA NA NA 7 NA NA 7
The output I want
> df_out
A B C D E F X new
1 4 NA NA NA 5 NA 4 A
2 NA 5 NA 4 NA NA 5 B
3 NA NA NA 6 NA NA 6 D
4 NA NA NA 7 NA NA 7 D
I will prefer dplyr
/tidyverse
syntax that I'll integrate into my existing syntax.
Upvotes: 2
Views: 977
Reputation: 389135
Compare A:F
column with X
column, replace NA
's with FALSE
and use max.col
to get the index of TRUE
value in each row which can be used to get column name.
library(dplyr)
df %>%
mutate(new = {
tmp <- select(., A:F) == X
names(.)[max.col(replace(tmp, is.na(tmp), FALSE))]
})
# A B C D E F X new
#1 4 NA NA NA 5 NA 4 A
#2 NA 5 NA 4 NA NA 5 B
#3 NA NA NA 6 NA NA 6 D
#4 NA NA NA 7 NA NA 7 D
In base R this can be written as :
tmp <- df[1:5] == df$X
df$new <- names(df)[max.col(replace(tmp, is.na(tmp), FALSE))]
This solution assumes that you'll have at least one match of X
in the row as shown in the example.
Upvotes: 2
Reputation: 40051
One option could be:
df %>%
rowwise() %>%
mutate(new = names(.)[which(c_across(-X) %in% X)])
A B C D E F X new
<int> <int> <lgl> <int> <int> <lgl> <int> <chr>
1 4 NA NA NA 5 NA 4 A
2 NA 5 NA 4 NA NA 5 B
3 NA NA NA 6 NA NA 6 D
4 NA NA NA 7 NA NA 7 D
The solution above assumes that the column names correspond to the positions established by which()
. However, if it is not the case (e.g. c_across(-c(C, E, X)
), then the results won't be correct. A solution for more complex situations could be:
df %>%
mutate(new = Reduce(coalesce, across(-c(C, E, X), ~ ifelse(. == X, cur_column(), NA_character_))))
A B C D E F X new
1 4 NA NA NA 5 NA 4 A
2 NA 5 NA 4 NA NA 5 B
3 NA NA NA 6 NA NA 6 D
4 NA NA NA 7 NA NA 7 D
Upvotes: 4
Reputation: 3047
df %>%
pivot_longer(cols = -X) %>%
mutate(
match = if_else(X == value, name, NA_character_)
) %>%
pivot_wider() %>%
filter(!is.na(match))
Upvotes: 2