Reputation: 584
I have a data frame with two columns:
df = data.frame(animals = c("cat; dog; bird", "dog; bird", "bird"), sentences = c("the cat is brown; the dog is barking; the bird is green and blue","the dog is black; the bird is yellow and blue", "the bird is blue"), stringsAsFactors = FALSE)
For each row, I need the total number of times the animals listed in that row's "animals" entry occur across the entire "sentences" column.
For example, for the first row of "animals" ("cat; dog; bird"): sum_occurrences_sentences_column = (cat = 1) + (dog = 2) + (bird = 3) = 6.
The result should be a new column like this:
df <- cbind(sum_occurrences_sentences_column = c("6", "5", "3"), df)
I have tried the following code, but neither line works:
df[str_split(df$animals, ";") %in% df$sentences, ]
str_count(df$sentences, str_split(df$animals, ";"))
Any help would be appreciated :)
Upvotes: 0
Views: 1841
Reputation: 35554
A map() way to handle each animal in the first column:
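library(tidyverse)
# Split the whole sentences column into individual pieces once, up front
string <- unlist(str_split(df$sentences, ";"))
df %>%
  rowwise() %>%
  mutate(SUM = str_split(animals, "; ", simplify = TRUE) %>%
           map(~ str_count(string, .)) %>%  # count each animal in every piece
           unlist() %>%
           sum())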
# animals sentences SUM
# <chr> <chr> <int>
# 1 cat; dog; bird the cat is brown; the dog is barking; the bird... 6
# 2 dog; bird the dog is black; the bird is yellow and blue 5
# 3 bird the bird is blue 3
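The same counts can also be sketched without rowwise(), assuming only stringr is loaded. This is an illustrative variant, not a replacement for the pipeline above: the name all_text is made up here, and fixed() is used so that each animal is matched literally rather than as a regular expression. Note that str_count() counts substrings, so "cat" would also be counted inside a word such as "category".

```r
library(stringr)

df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c(
    "the cat is brown; the dog is barking; the bird is green and blue",
    "the dog is black; the bird is yellow and blue",
    "the bird is blue"
  ),
  stringsAsFactors = FALSE
)

# Collapse the whole sentences column into one string (all_text is a hypothetical name)
all_text <- paste(df$sentences, collapse = "; ")

# For each row, split the animals and sum their literal occurrences in all_text
df$SUM <- vapply(
  str_split(df$animals, fixed("; ")),
  function(a) sum(str_count(all_text, fixed(a))),
  integer(1)
)

df$SUM
```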
Upvotes: 1
Reputation: 2640
Here's a base R solution:
First remove all the ; with gsub, then split the sentences column on spaces and unlist it into a vector of words:
split_sentence_column = unlist(strsplit(gsub(';','',df$sentences),' '))
Then set up a for loop and, for each row, get a vector of the animals, check which words from the sentences column are in that animal list with %in%, then sum all the TRUE cases. We can then assign this to a new df column directly:
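for (i in seq_len(nrow(df))) {
  # animals listed in row i
  animals = unlist(strsplit(df$animals[i], '; '))
  # number of words in the sentences column that match any of them
  df$sum_occurrences_sentences_column[i] = sum(split_sentence_column %in% animals)
}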
> df
animals sentences sum_occurrences_sentences_column
1 cat; dog; bird the cat is brown; the dog is barking; the bird is green and blue 6
2 dog; bird the dog is black; the bird is yellow and blue 5
3 bird the bird is blue 3
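As a loop-free sketch of the same idea, the for loop could be swapped for vapply(), still in base R. The variable name animal_lists is introduced here for illustration only. Like the loop, this matches whole space-separated words, so a word with attached punctuation (for example "dog.") would not be counted.

```r
df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c(
    "the cat is brown; the dog is barking; the bird is green and blue",
    "the dog is black; the bird is yellow and blue",
    "the bird is blue"
  ),
  stringsAsFactors = FALSE
)

# All words of the sentences column, with the ";" separators removed
split_sentence_column <- unlist(strsplit(gsub(";", "", df$sentences), " "))

# One character vector of animals per row (animal_lists is a hypothetical name)
animal_lists <- strsplit(df$animals, "; ", fixed = TRUE)

# For each row, count the words that are one of that row's animals
df$sum_occurrences_sentences_column <- vapply(
  animal_lists,
  function(a) sum(split_sentence_column %in% a),
  integer(1)
)

df$sum_occurrences_sentences_column
```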
Upvotes: 3