p c

Reputation: 33

Finding strings in a dataframe with vector of strings

Good evening,

I have the following data frame:

df <- data.frame(I(list(c("nugget de blé","boeuf"),"nugget de blé")))
list_dishes <- c("nugget de blé")

I want to check whether each row of df contains a dish from list_dishes, counting 1 when a dish is identified.

I wrote the following function, but it doesn't work: it only identifies "nugget de blé" when it is the only dish in the row.

classification <- function(X,list){
  compteur = 0
  if(length(which(X%in%list))>=1){compteur = compteur + 1}
  return(compteur)
}

results <- data.frame(apply(df, c(1,2), function(x) classification(x,list_dishes)))

Can you help me, please?

Thanks

Upvotes: 3

Views: 155

Answers (2)

akrun

Reputation: 886948

We can use sapply with grepl from base R:

df$found_dishes <- sapply(df[[1]], function(x) any(grepl(list_dishes, x)))
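Note that grepl treats list_dishes as a regular expression and, when given several patterns, uses only the first one (with a warning). If list_dishes may hold more than one dish, an exact-match variant with %in% might be closer to what the question asks for. This is a minimal sketch under that assumption; the column name `dishes`, the third row, and the extra entry "pizza" are made up for illustration and are not part of the original data:

```r
# Hypothetical data: the column name `dishes`, the third row and "pizza"
# are illustrative assumptions, not taken from the question
df <- data.frame(dishes = I(list(c("nugget de blé", "boeuf"),
                                 "nugget de blé",
                                 "boeuf")))
list_dishes <- c("nugget de blé", "pizza")

# %in% compares each dish exactly against every element of list_dishes;
# any() collapses the per-dish results to one logical per row
df$found_dishes <- vapply(df$dishes,
                          function(x) any(x %in% list_dishes),
                          logical(1))

# 1 for each row in which a dish was identified, then the overall total
df$count <- as.integer(df$found_dishes)
total <- sum(df$count)
```

Here found_dishes is TRUE for the first two rows and FALSE for the third, so total is 2. vapply is used instead of sapply only so that the result is guaranteed to be a logical vector.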

Upvotes: 1

Joao Pedro Macalos

Reputation: 378

If you don't mind using the tidyverse packages, here is one solution:

library(tidyverse)

# I added a name to the column of your data.frame
df <- data.frame(a = I(list(c("nugget de blé","boeuf"), 
                            "nugget de blé")))
list_dishes <- c("nugget de blé")

tibble(df) %>%
  mutate(id = row_number()) %>%
  rowwise() %>%
  mutate(found_dishes = map(a, ~str_detect(.x, list_dishes))) %>%
  unnest(found_dishes) %>%
  filter(found_dishes)

#> A tibble: 2 x 3
#> a            id found_dishes
#> <I<list>> <int> <lgl>       
#>1  <chr [2]>   1 TRUE        
#>2  <chr [1]>   2 TRUE    

Then count the rows of the result (for example with nrow()) to get the number of matches.

Upvotes: 1
