Reputation: 35
I'm not a very proficient programmer, so bear with me. I want to remove duplicates in variable 'B', but only within the same value of variable 'A'. That is, I want to keep only one 'a' in the group of 1's without removing the 'a' in the group of 2's.
A <- c(1,1,1,2,2,2)
B <- c('a','b','a','c','a','d')
ab <- cbind(A,B)
AB <- as.data.frame(ab)
Thanks in advance! I hope that was clear enough.
Upvotes: 2
Views: 52
Reputation: 195
You may also want to take a look at the duplicated() function. Your example
a <- c(1,1,1,2,2,2)
b <- c('a','b','a','c','a','d')
ab <- cbind(a,b)
ab_df <- as.data.frame(ab)
gives you the following data frame:
> ab_df
a b
1 1 a
2 1 b
3 1 a
4 2 c
5 2 a
6 2 d
Row 3 duplicates row 1. duplicated(ab_df) returns a logical vector that marks every row repeating an earlier one:
> duplicated(ab_df)
[1] FALSE FALSE TRUE FALSE FALSE FALSE
Negating that vector and using it as a row index drops the duplicated rows from your original data frame:
> d <- duplicated(ab_df)
> ab_df[!d, ]
a b
1 1 a
2 1 b
4 2 c
5 2 a
6 2 d
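If the real data frame has more columns than the two in the example, duplicated() can be restricted to the grouping column and the value column. A minimal sketch, assuming a hypothetical extra column x that is not part of the original question:

```r
# Example data from the question, plus a hypothetical extra column x
df <- data.frame(
  A = c(1, 1, 1, 2, 2, 2),
  B = c("a", "b", "a", "c", "a", "d"),
  x = 1:6
)

# Flag rows whose (A, B) pair has already appeared; x is ignored here
dup <- duplicated(df[, c("A", "B")])

# Keep only the first occurrence of each (A, B) pair
result <- df[!dup, ]
print(result)
```

Because the check looks at A and B together, the 'a' in group 2 survives while the second 'a' in group 1 is dropped.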
Upvotes: 1
Reputation: 73397
You may use unique, which removes the duplicated rows of your data frame.
AB <- unique(AB)
AB
# A B
# 1 1 a
# 2 1 b
# 4 2 c
# 5 2 a
# 6 2 d
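For a data frame, unique() and indexing with !duplicated() should give the same rows. A short sketch to check this on the question's data (the names df, via_unique and via_duplicated are only illustrative):

```r
# Rebuild the example data directly as a data frame
df <- data.frame(
  A = c(1, 1, 1, 2, 2, 2),
  B = c("a", "b", "a", "c", "a", "d")
)

# Two ways of dropping fully duplicated rows
via_unique     <- unique(df)
via_duplicated <- df[!duplicated(df), ]

# Compare the two results
same <- identical(via_unique, via_duplicated)
print(same)
```

Both approaches compare whole rows, so they only treat a row as a duplicate when A and B match together, which is the per-group behaviour asked for.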
Upvotes: 1