Reputation: 183
I want to add a new column to a dataframe based on the values in two existing columns.
I have the following data:
Animal.1 <- c("A", "B", "C", "B", "A" )
Animal.2 <- c("B", "A", "A", "C", "C")
df <- data.frame(Animal.1, Animal.2)
If the following conditions are met:
Animal.1 = A and Animal.2 = B OR Animal.1 = B and Animal.2 = A
I would like the new column called pair.code to equal 1.
I would like a different number for every pair of animal IDs, but the same number should be used whenever the same two IDs appear together, regardless of which one is in Animal.1 and which is in Animal.2.
The final data should look like this:
Animal.1 <- c("A", "B", "C", "B", "A" )
Animal.2 <- c("B", "A", "A", "C", "C")
pair.code <- c("1", "1", "2", "3", "2")
df <- data.frame(Animal.1, Animal.2, pair.code)
Upvotes: 1
Views: 206
Reputation: 173793
A solution using factor:
df$pair.code <- as.numeric(factor(apply(df, 1, function(x) paste0(sort(x), collapse=""))))
df
#> Animal.1 Animal.2 pair.code
#> 1 A B 1
#> 2 B A 1
#> 3 C A 2
#> 4 B C 3
#> 5 A C 2
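One thing to be aware of: factor() numbers the pairs in sorted order of their keys, not in order of first appearance. The two happen to coincide for the data above. A small sketch with made-up keys (the vector keys is hypothetical, not taken from the question) shows how the numbering can differ from a match()-based approach:

```r
# Hypothetical pair keys, already sorted within each pair
keys <- c("BC", "AB", "BC")

# factor() sorts its levels, so "AB" gets code 1 although it appears second
by_factor <- as.numeric(factor(keys))

# match() against unique() numbers the keys in order of first appearance
by_match <- match(keys, unique(keys))

by_factor
#> [1] 2 1 2
by_match
#> [1] 1 2 1
```

Either numbering satisfies "same number for the same pair"; it only matters if the codes are expected to follow row order.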
Upvotes: 2
Reputation: 886938
We can first sort the elements by row and then create the 'pair.code' with match:
m1 <- t(apply(df, 1, sort))
v1 <- paste(m1[,1], m1[,2])
df$pair.code <- match(v1, unique(v1))
df$pair.code
#[1] 1 1 2 3 2
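For larger data, the row-wise apply can probably be avoided. The following is a sketch of a vectorised variant of the same idea, assuming exactly two ID columns; it swaps in pmin()/pmax() for the row-wise sort, and the helper names a, b and key are only illustrative:

```r
Animal.1 <- c("A", "B", "C", "B", "A")
Animal.2 <- c("B", "A", "A", "C", "C")
df <- data.frame(Animal.1, Animal.2)

# as.character() guards against factor columns (the data.frame default before R 4.0)
a <- as.character(df$Animal.1)
b <- as.character(df$Animal.2)

# Order-independent key: smaller ID first, larger ID second
key <- paste(pmin(a, b), pmax(a, b))

# Number the keys in order of first appearance
df$pair.code <- match(key, unique(key))
df$pair.code
#[1] 1 1 2 3 2
```

pmin() and pmax() compare the two columns element by element, so each key comes out the same whichever column an ID was in.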
Upvotes: 2