Reputation: 1113
In the following data.frame df, I would like to create a new column whose values are derived by classifying the A column. If a number in the A column matches one of the numbers in the G1 vector, the new column, called group, should be "G1" for that row. Similarly, if a value in the A column matches one of the values in the G2 vector, it should be classified as "G2". All remaining rows should be classified as "G0".
A <- seq(1900,2000,1)
B <- rnorm(101,10,2)
df <- data.frame(A=A,B=B)
G1 <- c(1963,1982,1952)
G2 <- c(1920,1933,1995)
# This doesn't do what I would like it to achieve
df$group <- ifelse(df$A == G1,"G1",ifelse(df$A == G2,"G2","G0"))
Upvotes: 2
Views: 805
Reputation: 48211
Here's a fun and concise alternative:
df$group <- c("G0", "G1", "G2")[1 + 1 * df$A %in% G1 + 2 * df$A %in% G2]
We index into a vector of three options, c("G0", "G1", "G2"). Thinking element-wise: if neither df$A %in% G1 nor df$A %in% G2 is true, the index is 1 and we choose "G0" (due to the 1 + ... at the beginning). Since G1 and G2 don't overlap, the index is 2 and "G1" is chosen only if df$A %in% G1. Similarly, the index is 3 and "G2" is chosen only if df$A %in% G2. No parentheses are needed because %in% has higher precedence than * and +.
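To make the index arithmetic concrete, here is a minimal sketch on a four-element vector. The short A below is a hypothetical stand-in for df$A, chosen only so that each case appears; G1 and G2 are the vectors from the question.

```r
# Hypothetical short input standing in for df$A
A  <- c(1920, 1952, 1900, 1995)
G1 <- c(1963, 1982, 1952)
G2 <- c(1920, 1933, 1995)

# %in% binds tighter than * and +, so this evaluates as
# 1 + 1 * (A %in% G1) + 2 * (A %in% G2)
idx <- 1 + 1 * A %in% G1 + 2 * A %in% G2
idx
# 3 2 1 3

# Use the index to pick a label for each element
labels <- c("G0", "G1", "G2")[idx]
labels
# "G2" "G1" "G0" "G2"
```

If a value were in both G1 and G2, the index would be 4 and the lookup would return NA, so this trick relies on the two vectors not overlapping.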
Upvotes: 1
Reputation: 389047
What you are looking for is
df$group <- ifelse(df$A %in% G1, "G1", ifelse(df$A %in% G2, "G2", "G0"))
which can be expressed more readably with case_when from dplyr:
library(dplyr)
# Assign the result back so the new column is kept
df <- df %>%
  mutate(group = case_when(A %in% G1 ~ "G1",
                           A %in% G2 ~ "G2",
                           TRUE ~ "G0"))
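A quick sanity check can confirm the classification. The sketch below uses the base R ifelse form so that it does not depend on dplyr, and the counts name is only illustrative. With the question's data, three rows should land in each of G1 and G2 and the remaining 95 in G0.

```r
# Rebuild the question's key column and group vectors
A  <- seq(1900, 2000, 1)
df <- data.frame(A = A)
G1 <- c(1963, 1982, 1952)
G2 <- c(1920, 1933, 1995)

# Classify with nested ifelse() and %in%
df$group <- ifelse(df$A %in% G1, "G1", ifelse(df$A %in% G2, "G2", "G0"))

# Count rows per group (table() sorts the labels alphabetically)
counts <- table(df$group)
counts
# G0 G1 G2
# 95  3  3
```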
Upvotes: 3
Reputation: 1094
The problem is that you don't want to test whether a value in the column is equal to G1 or G2; those are vectors, so == compares element by element (recycling the shorter vector), and that test doesn't make sense here. Instead, you want to know whether the value is an element of G1 or G2. Tweak your code to
df$group <- ifelse(df$A %in% G1,"G1",ifelse(df$A %in% G2,"G2","G0"))
This worked when I checked it. There may be a more elegant solution, but this is closely aligned to your first attempt.
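To see the difference, here is a small sketch using A and G1 from the question. The names eq_hits and in_hits are only illustrative. Because == recycles G1 along A, each element of A is compared with just one element of G1, determined by its position.

```r
A  <- seq(1900, 2000, 1)
G1 <- c(1963, 1982, 1952)

# == recycles G1 (length 3) along A (length 101): A[i] is compared only
# with G1[((i - 1) %% 3) + 1]. R also warns here that the longer object
# length is not a multiple of the shorter object length.
eq_hits <- A[A == G1]
eq_hits
# 1963 1982   (1952 is missed: it happens to line up with 1982)

# %in% tests each element of A for membership in the whole of G1
in_hits <- A[A %in% G1]
in_hits
# 1952 1963 1982
```

Two of the three years match under == only because of where they happen to fall in A; with different data, == could miss all of them.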
Upvotes: 1