Data I have:
A | B |
---|---|
1 | a |
2 | c |
2 | e |
3 | f |
4 | h |
5 | c |
5 | e |
What I want:
A | B | Group |
---|---|---|
1 | a | 1 |
2 | c | 2 |
2 | e | 2 |
3 | f | 3 |
4 | h | 4 |
5 | c | 2 |
5 | e | 2 |
Code I attempted:
library(readxl)
library(dplyr)
library(stringr)
data1 <- read_excel("testing.xlsx")
data2 <- data1 %>%
  group_by(A) %>%
  group_by(B) %>%
  mutate(Group = cur_group_id()) %>%
  ungroup()
What I’m getting from this code:
A | B | Group |
---|---|---|
1 | a | 1 |
2 | c | 2 |
2 | e | 3 |
3 | f | 4 |
4 | h | 5 |
5 | c | 2 |
5 | e | 3 |
EDIT: With every answer below I get the error "Can't supply `.by` when `.data` is a grouped data frame." The original data I am manipulating was left-joined and then grouped. How do I approach this?
Upvotes: 4
Views: 148
Reputation: 24845
You can do this:
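library(data.table)
# Build one key per A (its sorted B values pasted together), label each key
# with the smallest A that has it, then join the labels back onto the rows.
setDT(df)[
  df[, .(B = paste0(sort(B), collapse = "")), A][, Group := min(A), B],
  on = "A", .(A, B, Group)
]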
Output:
A B Group
<int> <char> <int>
1: 1 a 1
2: 2 c 2
3: 2 e 2
4: 3 f 3
5: 4 h 4
6: 5 c 2
7: 5 e 2
But, as others have pointed out, this only works because B can readily be sorted and pasted together, and A is numeric.
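To illustrate that caveat, here is a minimal sketch that does not rely on A being numeric. It numbers the distinct B sets with data.table's `.GRP` instead of `min(A)`. The `"|"` separator and the `unique()` call are my own assumptions (they assume `"|"` never occurs in B and that repeated B values within one A should not matter), and the names `keys` and `res` are only illustrative.

```r
library(data.table)

# Example data from the question; A is deliberately character here
df <- data.table(
  A = c("1", "2", "2", "3", "4", "5", "5"),
  B = c("a", "c", "e", "f", "h", "c", "e")
)

# One key per A: its sorted, de-duplicated B values joined with a separator
keys <- df[, .(key = paste(sort(unique(B)), collapse = "|")), by = A]

# Number the distinct keys in order of first appearance
keys[, Group := .GRP, by = key]

# Attach the group id back to every original row
res <- keys[df, on = "A", .(A, B, Group)]
```

On the question's data, `res$Group` matches the output shown above. With other data the group numbers would be consecutive rather than the smallest A in each group.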
Upvotes: 2
Reputation: 102529
You can try the following:
library(dplyr)
df %>%
  left_join(
    (.) %>%
      summarise(group = as.factor(toString(sort(B))), .by = A) %>%
      mutate(group = as.integer(group))
  )
Or you can additionally use membership() from the igraph package:
library(dplyr)
library(igraph)
df %>%
  mutate(group = {
    (.) %>%
      graph_from_data_frame() %>%
      components() %>%
      membership()
  }[B])
which gives
A B group
1 1 a 1
2 2 c 2
3 2 e 2
4 3 f 3
5 4 h 4
6 5 c 2
7 5 e 2
If you are interested, you can visualize the graph with igraph:
df %>%
  graph_from_data_frame() %>%
  plot()
Upvotes: 6
Reputation: 79328
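library(dplyr)
df %>%
  # key each A by its B values (in row order), then label each key by its first A
  mutate(group = toString(B), .by = A) %>%
  mutate(group = A[match(group, group)], .by = group)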
A B group
1 1 a 1
2 2 c 2
3 2 e 2
4 3 f 3
5 4 h 4
6 5 c 2
7 5 e 2
Upvotes: 1
Reputation: 18712
library(dplyr)
data1 |>
  mutate(group = paste(sort(B), collapse = ""), .by = A) |>
  mutate(group = cur_group_id(), .by = group)
Output
A B group
1 1 a 1
2 2 c 2
3 2 e 2
4 3 f 3
5 4 h 4
6 5 c 2
7 5 e 2
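Regarding the error in the question's edit: dplyr refuses `.by` on a data frame that is already grouped, so one option is to drop the existing grouping first. Below is a minimal sketch, assuming `data1` still carries a `group_by()` from an earlier step. The `group_by(A)` is only there to reproduce that situation, and `.by` needs dplyr 1.1.0 or later.

```r
library(dplyr)

# Example data from the question, grouped to mimic the situation in the edit
data1 <- tibble(
  A = c(1, 2, 2, 3, 4, 5, 5),
  B = c("a", "c", "e", "f", "h", "c", "e")
) |>
  group_by(A)

data2 <- data1 |>
  ungroup() |>  # drop the existing grouping so that `.by` is allowed
  mutate(group = paste(sort(B), collapse = ""), .by = A) |>
  mutate(group = cur_group_id(), .by = group)
```

If the earlier grouping is still needed afterwards, it would have to be reapplied with `group_by()` after these steps.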
Upvotes: 2