Reputation: 1113

Classifying column in data.frame based on vectors

In the following data.frame df I would like to create a new column whose values are derived by classifying the A column. If a number in the A column matches one of the numbers in the G1 vector, the new column, called group, should contain "G1". Similarly, if a value in the A column matches one of the values in the G2 vector, it should be classified as "G2". All remaining rows should be classified as "G0".

A <- seq(1900,2000,1)
B <- rnorm(101,10,2)
df <- data.frame(A=A,B=B)

G1 <- c(1963,1982,1952)
G2 <- c(1920,1933,1995)

# This doesn't do what I would like it to achieve
df$group <- ifelse(df$A == G1,"G1",ifelse(df$A == G2,"G2","G0"))

Upvotes: 2

Answers (3)

Julius Vainora

Reputation: 48211

Here's a fun and concise alternative:

df$group <- c("G0", "G1", "G2")[1 + 1 * df$A %in% G1 + 2 * df$A %in% G2]

We have a vector of three options, c("G0", "G1", "G2"), and index into it. Thinking element-wise, if neither df$A %in% G1 nor df$A %in% G2 is true, the index is 1 (due to 1 + ... at the beginning) and we choose "G0". Since G1 and G2 don't overlap, the index is 2 and "G1" is chosen only if df$A %in% G1. Similarly, the index is 3 and "G2" is chosen only if df$A %in% G2.
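As a small illustration of the index arithmetic, here is a sketch on a hypothetical four-value vector a (not part of the original data). It relies on %in% binding more tightly than * and +, so each logical vector is built first and then scaled:

```r
G1 <- c(1963, 1982, 1952)
G2 <- c(1920, 1933, 1995)

# Hypothetical sample values: in neither vector, in G1, in G2, in neither
a <- c(1900, 1963, 1920, 2000)

# %in% has higher precedence than * and +, so this parses as
# 1 + 1 * (a %in% G1) + 2 * (a %in% G2)
idx <- 1 + 1 * a %in% G1 + 2 * a %in% G2
idx
# [1] 1 2 3 1

# Use the computed positions to pick the label for each element
group <- c("G0", "G1", "G2")[idx]
group
# [1] "G0" "G1" "G2" "G0"
```

If G1 and G2 did share a value, the index for that value would be 4 and the lookup would return NA, which is why the non-overlap matters.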

Upvotes: 1

Ronak Shah

Reputation: 389047

What you are looking for is

df$group <- ifelse(df$A %in% G1, "G1", ifelse(df$A %in% G2, "G2", "G0"))

which can be better represented with case_when from dplyr

library(dplyr)
df <- df %>%
   mutate(group = case_when(A %in% G1 ~ "G1", 
                            A %in% G2 ~ "G2", 
                            TRUE ~ "G0"))

Upvotes: 3

Joseph Clark McIntyre

Reputation: 1094

The problem is that you don't want to test whether a value in the column is equal to G1 or G2; those are vectors, so == compares them element by element (recycling the shorter vector) rather than checking membership. Instead, you want to know whether the value is an element of G1 or G2. Tweak your code to

df$group <- ifelse(df$A %in% G1,"G1",ifelse(df$A %in% G2,"G2","G0"))

This worked when I checked it. There may be a more elegant solution, but this is closely aligned to your first attempt.
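To see the difference between the two tests, here is a minimal sketch on a hypothetical six-value vector a (its length is chosen as a multiple of length(G1); with the original 101-row df, R would additionally warn that the longer object length is not a multiple of the shorter one):

```r
G1 <- c(1963, 1982, 1952)

# Hypothetical short vector; the first three values are all members of G1
a <- c(1952, 1963, 1982, 1900, 1901, 1902)

# == compares position by position, recycling G1 as 1963, 1982, 1952, 1963, ...
eq <- a == G1
eq
# [1] FALSE FALSE FALSE FALSE FALSE FALSE

# %in% asks, for each element of a, whether it appears anywhere in G1
isin <- a %in% G1
isin
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
```

Even though the first three values all belong to G1, == misses them because they don't line up with the same positions in G1.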

Upvotes: 1
