Reputation: 3
I have a dataset with two string variables. Both contain sentences that I want to compare word by word. I want to create a new column ("new_var") that should look like this:
var1                    var2                  new_var
"sentence numer one"    "setence numer two"   sentence:setence + one:two
"another one is here"   "aner one are hre"    another:aner + is:are + here:hre
I don't know how to write code that works on a whole dataset, i.e. that adds a new column based on a condition and a loop. My code only works when var1 and var2 are defined as single strings, like this:
library(stringr)
var1 = "this is sentence numer one"
var2 = "this is setence numer two"
new_var <- for (i in 1:(lengths(gregexpr("\\s+", var1)) + 1)) {
  if (word(string = var1, start = i, end = i) != word(string = var2, start = i, end = i)) {
    cat(word(string = var1, start = i, end = i), word(string = var2, start = i, end = i), "+", sep = ":")
  } else {
    cat("")
  }
}
Upvotes: 0
Views: 644
Reputation: 11981
One possibility would be to use str_split and then map2 from the purrr package.
First I create some pseudo data:
x <- c("sentence number one", "another one is here")
y <- c("setence number two", "aner one are hre")
Then I transform it:
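library(stringr)  # provides str_split
library(purrr)    # provides map2
# split every sentence into a vector of words
x2 <- str_split(x, " ")
y2 <- str_split(y, " ")
# compare word by word; keep "word1:word2" where they differ, "" otherwise
map2(x2, y2, ~ifelse(.x == .y, "", paste(.x, .y, sep = ":")))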
[[1]]
[1] "sentence:setence" "" "one:two"
[[2]]
[1] "another:aner" "" "is:are" "here:hre"
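To get from this list to the single new_var column asked for in the question, the differing pairs still have to be joined with " + " and written back to the data. The following is only a minimal sketch of that step. It uses base R's strsplit and mapply in place of str_split and map2 so that it runs without extra packages. The names df and diff_words are hypothetical, and it assumes both sentences in a row contain the same number of words.

```r
# Hypothetical data frame laid out like the question's dataset
df <- data.frame(
  var1 = c("sentence number one", "another one is here"),
  var2 = c("setence number two", "aner one are hre"),
  stringsAsFactors = FALSE
)

# Compare two word vectors position by position and join the
# mismatching pairs with " + ".
# Assumption: a and b have the same length.
diff_words <- function(a, b) {
  differs <- a != b
  paste(paste(a[differs], b[differs], sep = ":"), collapse = " + ")
}

# Split each sentence on single spaces and apply diff_words row by row
df$new_var <- mapply(
  diff_words,
  strsplit(df$var1, " ", fixed = TRUE),
  strsplit(df$var2, " ", fixed = TRUE)
)

df$new_var
# [1] "sentence:setence + one:two"       "another:aner + is:are + here:hre"
```

Rows in which every word matches end up with an empty string in new_var.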
Upvotes: 1