Reputation: 31
I have a data frame that looks like this:
DF <- data.frame(x=rep(c("A", "B", "C"), times=1, each=3),
y=c(1,2,3))
which gives me
x y
1 A 1
2 A 2
3 A 3
4 B 1
5 B 2
6 B 3
7 C 1
8 C 2
9 C 3
In my original data frame, each column represents a person, so I must match exactly one x to each y, giving me something like:
x y
A 1
B 2
C 3
In other words, I need y grouped by x, but no y value may be repeated across the data frame.
Any ideas?
I searched Stack Overflow but couldn't find anything that helped. Thank you!
Upvotes: 1
Views: 109
Reputation: 39154
A solution using dplyr, assuming that each group has as many rows as there are groups, so that the i-th group always has an i-th row to pick.
library(dplyr)
DF2 <- DF %>%
  mutate(Group_ID = group_indices(., x)) %>%
  group_by(x) %>%
  summarise(y = y[first(Group_ID)]) %>%
  ungroup()
DF2
# # A tibble: 3 x 2
# x y
# <fct> <dbl>
# 1 A 1
# 2 B 2
# 3 C 3
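For readers without dplyr, the same idea (take the i-th y from the i-th group) can be sketched in base R. This is only an illustration under the same assumption as above; the names `groups` and `picked` are made up for this example and are not part of the original answer.

```r
# Example data from the question
DF <- data.frame(x = rep(c("A", "B", "C"), each = 3),
                 y = c(1, 2, 3))

# Split y into one vector per value of x (groups come out in sorted order)
groups <- split(DF$y, DF$x)

# From the i-th group, take its i-th element
picked <- mapply(function(v, i) v[i], groups, seq_along(groups))

DF2 <- data.frame(x = names(groups), y = unname(picked))
DF2
```

If a group has fewer rows than its position, `v[i]` returns `NA` for that group, which makes a violated assumption easy to spot.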
Or we can use the following:
DF2 <- DF %>% filter(as.numeric(factor(x)) == y)
DF2
# x y
# 1 A 1
# 2 B 2
# 3 C 3
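This works because as.numeric() applied to a factor returns its integer level codes (A = 1, B = 2, C = 3), which can be compared with y directly. Note that since R 4.0.0, data.frame() keeps strings as character by default, in which case x must be converted with factor() first.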
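To make the level-code comparison concrete, here is a small base R sketch. It converts x explicitly with `factor()` so that it does not depend on the `stringsAsFactors` default; the variable name `codes` is introduced only for illustration.

```r
# Example data from the question
DF <- data.frame(x = rep(c("A", "B", "C"), each = 3),
                 y = c(1, 2, 3))

# Integer level codes: levels are sorted, so A = 1, B = 2, C = 3
codes <- as.integer(factor(DF$x))
codes
# [1] 1 1 1 2 2 2 3 3 3

# Keep only the rows where the level code equals y
DF2 <- DF[codes == DF$y, ]
rownames(DF2) <- NULL
DF2
```

Note that this relies on the y values happening to coincide with the level codes 1, 2, 3; with other y values the comparison would need to be against each row's position within its group instead.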
Upvotes: 2