Reputation: 53
I'm trying to count the number of times each row value in df1's Column A appears in df2's Column B, then populate the corresponding counts in a new column in df1.
df1
ColumnA
A_10
A_11
A_12
A_13
A_14
df2
ColumnB
A_10, A_50
A_10, A_50
A_14, A_50
A_50, A_10
Desired output
ColumnA New_Column
A_10 3
A_11 0
A_12 0
A_13 0
A_14 1
My Code:
df1$new_column <- 0
for (i in df1$ColumnA) {
df$new_column[i] <- sum(grepl(i, df2$ColumnB), na.rm = TRUE)
}
Been cracking my head the whole day, including looking at forum questions - What am I doing wrong here?
Upvotes: 0
Views: 31
Reputation: 388797
We can split the data in df2
on comma, count
each individual entry and join it with df1
. We replace the NA
values with 0.
library(dplyr)
library(tidyr)
df2 %>%
separate_rows(ColumnB, sep = ',\\s*') %>%
count(ColumnB) %>%
right_join(df1, by = c('ColumnB' = 'ColumnA')) %>%
mutate(n = replace_na(n, 0))
# ColumnB n
# <chr> <dbl>
#1 A_10 3
#2 A_14 1
#3 A_11 0
#4 A_12 0
#5 A_13 0
and using grepl
:
df1$New_Column <- sapply(df1$ColumnA, function(x) sum(grepl(x, df2$ColumnB)))
Upvotes: 1