Reputation: 199
I am relatively new R user and I have a query. I have 2 dataframes as shown below:
df1 <- data.frame(
OrderNo = c("ORD-34212", "ORD-12252", "ORD-78564", "ORD-56438",
"ORD-13198", "ORD-12258", "ORD-34567"),
Status = c("Delivered", "Not delivered", "Not delivered",
"Delivered", "Not delivered", "Delivered", "Delivered"),
stringsAsFactors = FALSE
)
df2 <- data.frame(
Description = c("ORD-34212 was delivered but not ORD-12252", "ORD-78564 was placed on 21/2/2018",
"my order is ORD-56438", "ORD-56438 and ORD-12258", "ORD-34212 is from Europe"),
DB = c("updated", "updated", "not updated", "not updated", "updated"),
stringsAsFactors = FALSE
)
I want to be able to count the occurrences of each "OrderNo" mentioned in df1 from the column "Description" in df2. I know I have to use str_count but I am unable to generate the logic for the same.
I want the output df1 to look like this with count and a flag if orderno was found:
Can you please help out here? Thanks in advance!
Upvotes: 1
Views: 80
Reputation: 2283
A dplyr solution:
library(dplyr)
df1 %>% rowwise() %>% mutate(Count = sum(grepl(OrderNo,df2$Description)), FoundInSecondDF = any(grepl(OrderNo,df2$Description)))
# Source: local data frame [7 x 4]
# Groups: <by row>
#
# A tibble: 7 x 4
# OrderNo Status Count FoundInSecondDF
# <chr> <chr> <int> <lgl>
# 1 ORD-34212 Delivered 2 T
# 2 ORD-12252 Not delivered 1 T
# 3 ORD-78564 Not delivered 1 T
# 4 ORD-56438 Delivered 2 T
# 5 ORD-13198 Not delivered 0 F
# 6 ORD-12258 Delivered 1 T
# 7 ORD-34567 Delivered 0 F
Upvotes: 2