user183974

Reputation: 183

How to check that values in one column are all identical within each level of a second grouping variable?

I am using R to analyse some data that is in long format. One column is a grouping variable containing participant IDs, and another column contains each participant's sex.

e.g.

ID SEX
1   M
1   M
2   F
2   F
2   M

I would like to check whether there are any IDs that do not have sex coded consistently, e.g. ID 2 above. Is there a way to do this? I have been playing around with dplyr and the group_by function, but I am at a loss. Any help would be greatly appreciated.

In terms of output, I would probably like a vector of all unique ID values that have non-identical values in the SEX column.

Upvotes: 0

Views: 60

Answers (2)

Shree

Reputation: 11150

Here's a base R solution using ave():

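# Convert SEX to integer codes so ave() returns numeric counts for both character and factor columns
df[ave(as.integer(factor(df$SEX)), df$ID, FUN = function(x) length(unique(x))) > 1, ]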

  ID SEX
3  2   F
4  2   F
5  2   M
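
If a plain vector of the offending IDs is wanted, as the question asks, one possible base R sketch counts the distinct SEX values per ID with tapply() and keeps the IDs whose count exceeds one. The name bad_ids is only illustrative, and the conversion with as.numeric() assumes the IDs are numeric, as in the example data.

```r
# Example data from the question
df <- data.frame(ID = c(1, 1, 2, 2, 2),
                 SEX = c("M", "M", "F", "F", "M"))

# Count the distinct SEX values within each ID
n_sex <- tapply(df$SEX, df$ID, function(x) length(unique(x)))

# Keep the IDs with more than one distinct value
# (names are character, so convert back; assumes numeric IDs)
bad_ids <- as.numeric(names(n_sex)[n_sex > 1])
bad_ids
```

For the example data this leaves only ID 2 in bad_ids.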

Upvotes: 1

Serkan Arslan

Reputation: 13403

You can try this.

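library(plyr)

# Example data from the question
df <- data.frame(ID = c(1, 1, 2, 2, 2), SEX = c('M', 'M', 'F', 'F', 'M'))

# Add the number of distinct SEX values per ID as a new column
df2 <- ddply(df, .(ID), mutate, count = length(unique(SEX)))
# Keep the ID column of rows with more than one distinct value, dropping duplicates
unique(df2[df2$count > 1, ][1])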

Result:

ID
2
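
Since the question mentions dplyr and group_by, the same idea could be sketched there as well. This is only one possible approach, not part of the answer above: it assumes dplyr is installed, and bad_ids is an illustrative name. The calls are namespace-qualified because plyr, if loaded afterwards, masks summarise() with a version that ignores the grouping.

```r
library(dplyr)

# Example data from the question
df <- data.frame(ID = c(1, 1, 2, 2, 2),
                 SEX = c("M", "M", "F", "F", "M"))

bad_ids <- df %>%
  dplyr::group_by(ID) %>%
  dplyr::summarise(n_sex = dplyr::n_distinct(SEX)) %>%  # distinct SEX values per ID
  dplyr::filter(n_sex > 1) %>%                          # keep inconsistent IDs only
  dplyr::pull(ID)                                       # return a plain vector
bad_ids
```

For the example data, bad_ids contains only ID 2.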

Upvotes: 0
