Ran Tao

Reputation: 311

Tag text using grep and paste in R

I have two data frames. The first one:

keyword <- c("apple","peach","grape","berry","kiwi fruit")
keyword <- data.frame(keyword)


The second one:

sentence <- c("I like apple","I hate apple","grape is good")
url <- c("url1","url2","url3")
sentence <- data.frame(sentence,url)


What I need is: if a keyword is contained in a sentence, paste that sentence's url next to the keyword. If multiple sentences contain the keyword, paste all of their urls.

I tried the code below, but it did not work as expected.

keyword$Label <- character(length(keyword$keyword))

for (i in 1:length(keyword$keyword)) {
keyword$Label[grep(keyword$keyword[i],sentence$sentence)] <- sentence$url
}

Upvotes: 1

Views: 200

Answers (1)

acylam

Reputation: 18681

A solution with stringr + dplyr + tidyr:

library(stringr)
library(dplyr)
library(tidyr)

sentence %>%
  mutate(sentence = str_extract(sentence, paste0(keyword$keyword, collapse = "|"))) %>%
  right_join(keyword, by = c("sentence" = "keyword")) %>%
  group_by(sentence) %>%
  mutate(URL = 1:n()) %>%
  spread(URL, url, sep = "") %>%
  rename(keyword = sentence)

Result:

# A tibble: 5 x 3
# Groups:   keyword [5]
     keyword  URL1  URL2
*      <chr> <chr> <chr>
1      apple  url1  url2
2      berry  <NA>  <NA>
3      grape  url3  <NA>
4 kiwi fruit  <NA>  <NA>
5      peach  <NA>  <NA>

Data:

keyword <- c("apple","peach","grape","berry","kiwi fruit")
keyword <- data.frame(keyword, stringsAsFactors = FALSE)
sentence <- c("I like apple","I hate apple","grape is good")
url <- c("url1","url2","url3")
sentence <- data.frame(sentence,url, stringsAsFactors = FALSE)
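
If a single column holding all matching urls is enough, a base R approach closer to the original grep/paste attempt might look like the sketch below. The loop in the question fails because `grep()` returns row positions in `sentence`, which are then used to index `keyword$Label`. Here the lookup is done per keyword instead. The `Label` column name is taken from the question; the `", "` separator and the use of `fixed = TRUE` (literal matching rather than regex) are assumptions about what is wanted.

```r
# Same data as above, with strings kept as character vectors.
keyword <- data.frame(
  keyword = c("apple", "peach", "grape", "berry", "kiwi fruit"),
  stringsAsFactors = FALSE
)
sentence <- data.frame(
  sentence = c("I like apple", "I hate apple", "grape is good"),
  url = c("url1", "url2", "url3"),
  stringsAsFactors = FALSE
)

# For each keyword, find the sentences that contain it and
# collapse their urls into one string.
keyword$Label <- vapply(
  keyword$keyword,
  function(k) {
    hits <- grepl(k, sentence$sentence, fixed = TRUE)  # literal match, not regex
    paste(sentence$url[hits], collapse = ", ")         # "" when nothing matches
  },
  character(1),
  USE.NAMES = FALSE
)

keyword
```

Keywords without a match get an empty string here rather than `NA`; replace `""` with `NA_character_` afterwards if missing values are preferred.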

Upvotes: 2
