user15290979

Is there an R function that checks whether all values in a group are the same as all values in another group?

Data I have:

A B
1 a
2 c
2 e
3 f
4 h
5 c
5 e

What I want:

A B Group
1 a 1
2 c 2
2 e 2
3 f 3
4 h 4
5 c 2
5 e 2

Code I attempted:

library(readxl)
library(dplyr)
library(stringr)
data1 <- read_excel("testing.xlsx")
data2 <- data1 %>% 
  group_by(A) %>% 
  group_by(B) %>% 
  mutate(Group = cur_group_id()) %>% 
  ungroup()

What I’m getting from this code:

A B Group
1 a 1
2 c 2
2 e 3
3 f 4
4 h 5
5 c 2
5 e 3

EDIT: I get the error “Can’t supply ‘.by’ when ‘.data’ is a grouped data frame.” with all of the answers below. The original data I am working with was left-joined and then grouped. How should I approach this?

Upvotes: 4

Views: 148

Answers (4)

langtang

Reputation: 24845

You can do this:

library(data.table)
setDT(df)[df[, .(B=paste0(sort(B), collapse="")),A][, Group:=min(A), B], on="A", .(A, B, Group)]

Output:

       A      B Group
   <int> <char> <int>
1:     1      a     1
2:     2      c     2
3:     2      e     2
4:     3      f     3
5:     4      h     4
6:     5      c     2
7:     5      e     2

But, as others have pointed out, this only works because B can readily be sorted and pasted together, and A is numeric.
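
The caveat above can be loosened somewhat. The following is a minimal base R sketch, not the answerer's method: it still builds one text key per value of A, but numbers the keys by first appearance instead of taking min(A), so A does not have to be numeric. It assumes that B can be sorted, that duplicate B values within one A should be ignored, and that the separator (the ASCII unit separator, chosen here for illustration) never occurs in B.

```r
# Sample data from the question
df <- data.frame(
  A = c(1, 2, 2, 3, 4, 5, 5),
  B = c("a", "c", "e", "f", "h", "c", "e"),
  stringsAsFactors = FALSE
)

# One key per value of A: its sorted, de-duplicated B values joined with a
# separator that is assumed never to occur in B
key_per_A <- vapply(
  split(df$B, df$A),
  function(b) paste(sort(unique(b)), collapse = "\x1f"),
  character(1)
)

# Look up each row's key by its A value, then number the distinct keys
# in order of first appearance
row_key <- key_per_A[as.character(df$A)]
df$Group <- match(row_key, unique(row_key))

print(df)
```

Rows with A = 2 and A = 5 share the key built from "c" and "e", so they end up with the same Group value.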

Upvotes: 2

ThomasIsCoding

Reputation: 102529

You can try the following:

library(dplyr)
df %>%
    left_join(
        (.) %>%
            summarise(group = as.factor(toString(sort(B))), .by = A) %>%
            mutate(group = as.integer(group))
    )

Alternatively, you can use membership() from the igraph package:

library(dplyr)
library(igraph)
df %>%
    mutate(group = {
        (.) %>%
            graph_from_data_frame() %>%
            components() %>%
            membership()
    }[B])

which gives

  A B group
1 1 a     1
2 2 c     2
3 2 e     2
4 3 f     3
5 4 h     4
6 5 c     2
7 5 e     2
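
The two approaches agree on this data, but they are not equivalent in general. The first groups values of A whose sets of B values are identical, while the igraph version groups by connected components, so any shared B value is enough to merge two values of A. The sketch below illustrates this with one hypothetical extra row (A = 6, B = "c") that is not in the question's data. The "A:" and "B:" label prefixes are also an addition for illustration, so that an A value can never collide with a B value as a vertex name.

```r
library(igraph)

# Question's data plus one hypothetical row: A = 6 has only "c"
df2 <- data.frame(
  A = c(1, 2, 2, 3, 4, 5, 5, 6),
  B = c("a", "c", "e", "f", "h", "c", "e", "c"),
  stringsAsFactors = FALSE
)

# Prefix the labels so that A values and B values get distinct vertex names
edges <- data.frame(
  from = paste0("A:", df2$A),
  to   = paste0("B:", df2$B),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(edges, directed = FALSE)
comp <- components(g)$membership  # component id, named by vertex label

# Each row takes the component of its A vertex
df2$group <- unname(comp[edges$from])

print(df2)
```

Here A = 6 lands in the same component as A = 2 and A = 5 because all three touch "c", even though its set of B values is different. If groups should require identical sets, the key-based approach is the one to use.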

Bonus (for those interested in igraph):

df %>%
    graph_from_data_frame() %>%
    plot()

shows the groups as connected components of the plotted graph.

Upvotes: 6

Onyambu

Reputation: 79328

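library(dplyr)
df %>%
  # Sort B so that the row order within each A does not affect the key
  mutate(group = toString(sort(B)), .by = A) %>%
  # Label each key with the first A value that carries it
  mutate(group = A[match(group, group)], .by = group)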

  A B group
1 1 a     1
2 2 c     2
3 2 e     2
4 3 f     3
5 4 h     4
6 5 c     2
7 5 e     2

Upvotes: 1

LMc

Reputation: 18712

library(dplyr)

data1 |>
  mutate(group = paste(sort(B), collapse = ""), .by = A) |>
  mutate(group = cur_group_id(), .by = group)

Output

  A B group
1 1 a     1
2 2 c     2
3 2 e     2
4 3 f     3
5 4 h     4
6 5 c     2
7 5 e     2
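
Regarding the error quoted in the question's edit: dplyr refuses `.by` when the input is already a grouped data frame, so one option is to drop the existing grouping with ungroup() first. The sketch below assumes dplyr 1.1.0 or later (for `.by`) and R 4.1 or later (for the native pipe). The group_by(A) call only stands in for whatever grouping the real data carries after the join.

```r
library(dplyr)

# Sample data, grouped to mimic the situation described in the edit
data1 <- tibble(
  A = c(1, 2, 2, 3, 4, 5, 5),
  B = c("a", "c", "e", "f", "h", "c", "e")
) |>
  group_by(A)

result <- data1 |>
  ungroup() |>  # remove the leftover grouping so that .by is allowed
  mutate(group = paste(sort(B), collapse = ""), .by = A) |>
  mutate(group = cur_group_id(), .by = group)

print(result)
```

Because `.by` groups only for the duration of a single call, the result comes back ungrouped. If the original grouping is still needed afterwards, it has to be reapplied with group_by().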

Upvotes: 2
