Identify the groups sharing a duplicated value

Question

I have a data frame summarizing the deployment history of different pieces of equipment identified by a serial number. Over time, the equipment (serno) can be used by different projects but can only be deployed on one project at any given time. I generated a new column called used.elsewhere that identifies if the serno in that row was duplicated within the df. For example:

project = c("a","b","c","c","c","d","e")
serno = c(1,2,2,2,3,3,3)
deployed = c(T,T,F,F,F,F,T)
used.elsewhere = c(F,T,T,T,T,T,T)
data.frame(project,serno,deployed,used.elsewhere)

  project serno deployed used.elsewhere
1       a     1     TRUE          FALSE
2       b     2     TRUE           TRUE
3       c     2    FALSE           TRUE
4       c     2    FALSE           TRUE
5       c     3    FALSE           TRUE
6       d     3    FALSE           TRUE
7       e     3     TRUE           TRUE

I want to generate a new column that, if the serno value is not deployed and is used elsewhere, indicates the project where the serno is deployed:

project = c("a","b","c","c","c","d","e")
serno = c(1,2,2,2,3,3,3)
deployed = c(T,T,F,F,F,F,T)
used.elsewhere = c(F,T,T,T,T,T,T)
other.project = c(NA, NA, "b", "b", "e", "e", NA)

  project serno deployed used.elsewhere other.project
1       a     1     TRUE          FALSE          
2       b     2     TRUE           TRUE          
3       c     2    FALSE           TRUE             b
4       c     2    FALSE           TRUE             b
5       c     3    FALSE           TRUE             e
6       d     3    FALSE           TRUE             e
7       e     3     TRUE           TRUE

I'm assuming that I could use an ifelse statement such as the following but I'm not sure how to finish it.

df %>%
  mutate(other.project = ifelse(deployed == F & used.elsewhere == T, ...

Thank-you in advance!

Ben · Accepted Answer

If you first group_by(serno), you can include project where deployed is TRUE within the same group.

library(dplyr)

df %>%
  group_by(serno) %>%
  mutate(other.project = ifelse(
    deployed == FALSE & used.elsewhere == TRUE,
    project[deployed],
    NA
  ))

Output

  project serno deployed used.elsewhere other.project
                            
1 a           1 TRUE     FALSE          NA           
2 b           2 TRUE     TRUE           NA           
3 c           2 FALSE    TRUE           b            
4 c           2 FALSE    TRUE           b            
5 c           3 FALSE    TRUE           e            
6 d           3 FALSE    TRUE           e            
7 e           3 TRUE     TRUE           NA

Identify the groups sharing a duplicated value

Answers (2)

Related Questions