RaiderNAYSHUN
RaiderNAYSHUN

Reputation: 61

How would I implement this countif in r?

My two dataframes:

 df1
    Col1 
     A
     B
     C

 df2
    Col1 
     A  
     D  
     E  
     F  

I would like to add a 2nd column, Col2, to df1 where each value in the column is 1 if it's respective value in Col1 is also in Col1 of df2. Thus df1 would look like this:

 df1
    Col1 Col2
     A    1
     B    0
     C    0

Thanks!

Upvotes: 1

Views: 134

Answers (3)

cderv
cderv

Reputation: 6542

Two options using data.table

First one uses %chin% operator :

library(data.table)
x = data.table(v = LETTERS[1:3])
y = data.table(v = c("A","D","E","F"))

x[, found:= v %chin% y$v]
x
#>    v found
#> 1: A  TRUE
#> 2: B FALSE
#> 3: C FALSE

The second one is built on merging behaviour:

library(data.table)
x = data.table(v = LETTERS[1:3])
y = data.table(v = c("A","D","E","F"))

y[, found := TRUE]
x[, found:= y[.SD, .(ifelse(is.na(found), FALSE, TRUE)), on  = .(v)]]
x
#>    v found
#> 1: A  TRUE
#> 2: B FALSE
#> 3: C FALSE

EDIT: Based on @frank comment, you could simplify with no ifelse - it is the same

x[, found:= y[.SD, !is.na(found), on  = .(v)]]
x
#>    v found
#> 1: A  TRUE
#> 2: B FALSE
#> 3: C FALSE

For understanding what happens, here is the inside code I built on:

x[, found := NULL]
y[x, on = .(v)]
#>    v found
#> 1: A  TRUE
#> 2: B    NA
#> 3: C    NA

Upvotes: 1

www
www

Reputation: 39154

df3 is the final output.

library(dplyr)

df1 <- data_frame(Col1 = c("A", "B", "C"))
df2 <- data_frame(Col1 = c("A", "D", "E", "F"))

df3 <- df1 %>% mutate(Col2 = as.numeric(Col1 %in% df2$Col1)) 

Or the following approach is similar to HarlandMason's method but using dplyr and tidyr.

library(dplyr)
library(tidyr)
df3 <- df2 %>%
  mutate(Col2 = 1) %>%
  right_join(df1, by = "Col1") %>%
  replace_na(list(Col2 = 0))

Upvotes: 1

HarlandMason
HarlandMason

Reputation: 789

Add the col2 to df2

df2$Col2 <- 1

Perform a left-join merge:

df3 <- merge(df1, df2, all.x=T, by='Col1')

Replace the NAs with zeros

df3$Col2[which(is.na(df3$Col2))] <- 0

df3 is now

  Col1 Col2
1    A    1
2    B    0
3    C    0

Edit: @ycw has done it more concisely using as.numeric and %in%. I like his answer, but I thought I'd edit mine to include a version of his work that doesn't use dplyr:

It's as simple as df1$Col2 <- as.numeric(df1$Col1 %in% df2$Col1). Much better than mine!

Upvotes: 2

Related Questions