Boudewijn Aasman

Reputation: 1256

Fast way to find if a value occurs within a group of a data frame

I want to find out whether a given value occurs anywhere within each group of a data frame, and then label every row of the group accordingly: one label (for example "yes" or 1) if the value occurs in that group, and another (for example "no" or 0) if it does not.

For example, suppose I'm interested in whether or not the value 1 occurs in a group.

df1 = data.frame(group = c(1,1,1,1,2,2,2,2,2,3,3,3),value = c(1,4,3,2,2,1,1,4,2,2,6,2))

> df1
   group value
1      1     1
2      1     4
3      1     3
4      1     2
5      2     2
6      2     1
7      2     1
8      2     4
9      2     2
10     3     2
11     3     6
12     3     2

Then, I would like to create a new column that specifies whether or not the value of 1 occurred anywhere in that group.

This is what it should look like:

> df1
   group value hasValue
1      1     1        yes
2      1     4        yes
3      1     3        yes
4      1     2        yes
5      2     2        yes
6      2     1        yes
7      2     1        yes
8      2     4        yes
9      2     2        yes
10     3     2        no
11     3     6        no
12     3     2        no

Note that each row in group 1 and group 2 has a value of "yes" because a 1 occurred in that group, while each row in group 3 has a "no" because 1 never occurs in group 3.

I solved this with a cobbled-together ("Frankenstein") solution, shown below, but I was hoping there would be a much faster solution using dplyr or data.table.

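library(reshape2)  # provides dcast()

# Count how often each value occurs in each group (one column per value)
x = dcast(df1, group ~ value, value.var = "value", fun.aggregate = length)
# Collect the groups in which the value 1 occurs at least once
vec = NULL
for(i in seq_len(nrow(x))){
  if(x$`1`[i] > 0){
    vec = c(vec, x$group[i])
  }
}
df1$hasValue = ifelse(df1$group %in% vec, "yes", "no")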

Upvotes: 0

Views: 236

Answers (1)

Dason

Reputation: 61933

You could get close using the ave function, which applies a function within each group and returns a vector the same length as its input:

ave(df1$value, df1$group, FUN = function(x){1 %in% x})

This gives 1 for every row in a group that contains a 1, and 0 otherwise. You could then use ifelse on it to convert to "yes"/"no" if you insist on that:

df1$hasValue <- ifelse(ave(df1$value, df1$group, FUN = function(x){1 %in% x}), "yes", "no")
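Since the question asks about data.table specifically, here is a rough sketch of two alternatives, not benchmarked and not part of the original answer. The first is plain base R without any grouping function; the second assumes the data.table package is installed and evaluates the same test once per group. The names groups_with_1 and dt are only illustrative.

```r
# Example data from the question
df1 <- data.frame(group = c(1,1,1,1,2,2,2,2,2,3,3,3),
                  value = c(1,4,3,2,2,1,1,4,2,2,6,2))

# Base R: find the groups that contain a 1, then flag every row in them
groups_with_1 <- unique(df1$group[df1$value == 1])
df1$hasValue <- ifelse(df1$group %in% groups_with_1, "yes", "no")

# data.table (assumption: the package is installed): same test, by group
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df1[, c("group", "value")])
  # `if` returns a single string per group; := recycles it over the group's rows
  dt[, hasValue := if (1 %in% value) "yes" else "no", by = group]
}
```

Both variants avoid the explicit loop; which one is faster on large data would need to be measured on your own data.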

Upvotes: 1
