Ahmet Atilla Colak

Reputation: 79

How can I loop over a tibble and change its column values based on string detection?

I have a function that loops over the rows of a given tibble and is meant to update a set of predetermined columns.

The code looks like this:

secenekler <- c("Marka ve üretici firma", "Son kullanma tarihi", "Enerji ve besin ögeleri", "Üretim tarihi", "Sağlık bilgisi", "Diğer", "Ürünün içeriği")

##
categorizer <- function(tibble) {

for (i in 1:nrow(tibble)) {
  for (i1 in 1:length(secenekler)) {
    sutun_ismi <- secenekler[i1]
    if (str_detect(tibble$text[i], sutun_ismi)) {
      gruplanacak_veri_seti[,sutun_ismi] == 1 
    }
    
    else {
      gruplanacak_veri_seti[,sutun_ismi] == 0
    }
}  
}
return(gruplanacak_veri_seti)
}  

The 'text' column holds free text, and I want to set the values of the other columns depending on whether each row's 'text' value contains the corresponding entry of the 'secenekler' vector. For instance, if the 'text' value in the third row doesn't contain the second entry of 'secenekler', the matching column in that row should be 0.

I already have the tibble that I want to use this function on. However, when I ran the function on it, nothing changed.

What am I missing or doing wrong?

P.S: below is the summary of my tibble.


Rows: 304
Columns: 9
$ Sno                       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
$ dikkat_edilen_bilgiler    <chr> "Marka ve üretici firma, Ürünün içeriği, Son kullanma…
$ `Marka ve üretici firma`  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ `Ürünün içeriği`          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ `Üretim tarihi`           <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ `Son kullanma tarihi`     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ Diğer                     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ `Enerji ve besin ögeleri` <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ `Sağlık bilgisi`          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…


Upvotes: 0

Views: 518

Answers (1)

Skaqqs

Reputation: 4140

In your function, you loop over each row of a data frame (tibble), and within each row you loop over your columns of interest (secenekler). For each row/column pair, you check the row's value in a reference column (tibble$text) against the column name.
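As a side note on why the original loop appears to do nothing: `==` is R's comparison operator, so a line like `gruplanacak_veri_seti[, sutun_ismi] == 1` computes a logical vector and discards it, leaving the data untouched. Assignment needs `<-`, and changing a single cell also needs the row index. A minimal base R sketch, using a made-up two-row data frame `df` (hypothetical, not the data from the question):

```r
# Hypothetical toy data, only for illustration
df <- data.frame(text = c("a", "b"), flag = c(NA_real_, NA_real_))

# Comparison: returns a logical vector, df itself is not modified
df[, "flag"] == 1
is.na(df$flag)  # the column is still all NA

# Assignment to one cell: row index plus column name, with `<-`
df[1, "flag"] <- 1
df[2, "flag"] <- 0
df$flag
```

The same applies inside a function: the object that is modified and returned should be the function's own argument, not a variable from the global environment.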

Rather than looping over rows, you can use ifelse(), which is vectorized, i.e., it operates on all rows of a target object at once. And because you've identified your columns by their exact names, an exact comparison is enough and no pattern matching is needed, provided each value in the reference column is exactly one of those names.

My example below operates on one column at a time, comparing that column's values to the values in the reference column (tibble$text).

secenekler <- c("Marka ve üretici firma", "Son kullanma tarihi",
                "Enerji ve besin ögeleri","Üretim tarihi",
                "Sağlık bilgisi", "Diğer", "Ürünün içeriği")

for(var in secenekler){
    tibble[,var] <- ifelse(tibble$text == var, yes = 1, no = 0)
}
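
One caveat, judging from the summary output in the question: the reference column seems to hold several options in a single comma-separated string (and is named `dikkat_edilen_bilgiler` there rather than `text`). If that is the case, an exact `==` comparison would only match rows that list a single option. Below is a hedged sketch of a substring-based variant using base R's `grepl()` with `fixed = TRUE`, so each option is matched literally rather than as a regular expression. The small data frame `df` and its contents are made up for illustration:

```r
secenekler <- c("Marka ve üretici firma", "Son kullanma tarihi", "Diğer")

# Hypothetical stand-in for the real tibble
df <- data.frame(
  text = c("Marka ve üretici firma, Son kullanma tarihi",
           "Diğer",
           "Son kullanma tarihi"),
  stringsAsFactors = FALSE
)

for (var in secenekler) {
  # grepl() is vectorized over rows; fixed = TRUE disables regex interpretation
  df[[var]] <- as.integer(grepl(var, df$text, fixed = TRUE))
}

df[["Son kullanma tarihi"]]
```

With stringr loaded, `str_detect(df$text, fixed(var))` should behave the same way as the `grepl()` call here.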

Upvotes: 1
