chippycentra

Reputation: 3432

Subset dataframe by unique values within a column in R

Hello, I have a dataframe such as:

Group COL1 Event 
G1 SP1  1
G1 SP2  1
G1 SP3  2
G1 SP3  2 
G2 SP4  3
G2 SP7  3
G2 SP5  6
G3 SP1  1 
G4 SP1  6  

I want to keep only the rows whose COL1 value is the only distinct COL1 value for its Event within its Group (here, for example, SP3 and SP5 are each alone in their Event).

The expected output is:

Group COL1 Event 
G1 SP3  2
G1 SP3  2 
G2 SP5  6 
G3 SP1  1 
G4 SP1  6 

SP1 and SP2 are two different COL1 values for Event 1 (in G1), so they do not pass.

SP4 and SP7 are two different COL1 values for Event 3 (in G2), so they do not pass.

Upvotes: 1

Views: 2585

Answers (3)

ThomasIsCoding

Reputation: 101034

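Another data.table option: group by Group and Event, then subset .SD with a single logical value, which keeps every row of the group when it is TRUE and none when it is FALSE.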

> setDT(df)[, .SD[uniqueN(COL1) == 1], by = .(Group, Event)]
   Group Event COL1
1:    G1     2  SP3
2:    G1     2  SP3
3:    G2     6  SP5
4:    G3     1  SP1
5:    G4     6  SP1

Upvotes: 1

akrun

Reputation: 886938

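An option in base R using ave(): count the distinct COL1 values within each Group/Event combination and keep the rows where that count is 1.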

subset(df, ave(COL1, Group, Event,
      FUN = function(x) length(unique(x))) == 1)
#  Group COL1 Event
#3    G1  SP3     2
#4    G1  SP3     2
#7    G2  SP5     6
#8    G3  SP1     1
#9    G4  SP1     6
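
To make the intermediate step visible, here is a minimal self-contained sketch of the same idea, not a replacement for the answer's code. It assumes the example data from the question, rebuilt by hand as a data.frame; the names n_col1 and kept are illustrative only. Passing row indices to ave() instead of COL1 itself is a variation chosen so that the counts stay numeric even if COL1 happens to be a factor.

```r
# Rebuild the question's example data as a plain data.frame.
df <- data.frame(
  Group = c("G1", "G1", "G1", "G1", "G2", "G2", "G2", "G3", "G4"),
  COL1  = c("SP1", "SP2", "SP3", "SP3", "SP4", "SP7", "SP5", "SP1", "SP1"),
  Event = c(1, 1, 2, 2, 3, 3, 6, 1, 6),
  stringsAsFactors = FALSE
)

# For each row, count the distinct COL1 values in its Group/Event combination.
# ave() is given row indices (integers), so the counts stay numeric
# whatever the type of COL1.
n_col1 <- ave(
  seq_len(nrow(df)), df$Group, df$Event,
  FUN = function(i) length(unique(df$COL1[i]))
)

# Keep the rows whose Group/Event combination holds a single COL1 value.
kept <- df[n_col1 == 1, ]
print(kept)
```

Printing n_col1 before subsetting is a quick way to check which Group/Event combinations contain more than one COL1 value.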
 

Upvotes: 2

IceCreamToucan

Reputation: 28675

You can use data.table to group by Group and Event and only return the group contents (.SD) if the number of unique COL1 values (uniqueN(COL1)) is 1.

library(data.table)
setDT(df)

df[, if(uniqueN(COL1) == 1) .SD, by = .(Group, Event)]
#    Group Event COL1
# 1:    G1     2  SP3
# 2:    G1     2  SP3
# 3:    G2     6  SP5
# 4:    G3     1  SP1
# 5:    G4     6  SP1

Data used:

df <- fread('
Group COL1 Event 
G1 SP1  1
G1 SP2  1
G1 SP3  2
G1 SP3  2 
G2 SP4  3
G2 SP7  3
G2 SP5  6
G3 SP1  1 
G4 SP1  6  
')

Upvotes: 3
