Elr Mant

Reputation: 527

Replace for loop with lapply

Is it possible to replace a for loop like this:

library(quanteda)
library(quanteda.dictionaries)

#dummy data
df <- data.frame(text = c("Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.", "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.", "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book."))

# start from an empty data frame so rbind has something to append to
dfm <- data.frame()

for (j in seq_len(nrow(df))) {
    out <- liwcalike(df$text[j],
                     dictionary = data_dictionary_NRC)
    dfm <- rbind(dfm, data.frame(em1 = out$trust, em2 = out$anger))
}

with lapply or anything else to reduce execution time?

Upvotes: 1

Views: 173

Answers (1)

Parfait

Reputation: 107567

Build a list of data frames and call rbind once outside the loop; this avoids the quadratic copying caused by calling rbind inside the loop:

df_list <- lapply(df$text, function(txt) {
               out <- liwcalike(txt, dictionary = data_dictionary_NRC)
               return(data.frame(em1 = out$trust, em2= out$anger, origin=txt))
           })

final_df <- do.call(rbind, df_list)
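
To try the two patterns side by side without installing quanteda, here is a minimal base R sketch. The `score_text` function is a hypothetical stand-in for `liwcalike` that returns made-up numbers; it is not part of any package. The point is only that both approaches should produce the same rows, while the second binds once instead of re-copying the accumulated data frame on every iteration.

```r
# Hypothetical stand-in for liwcalike(): one row of made-up scores per text.
score_text <- function(txt) {
  data.frame(trust = nchar(txt), anger = nchar(txt) %% 7)
}

texts <- c("first text", "second text here", "third")

# Pattern 1: grow a data frame inside the loop.
# Each rbind copies every row accumulated so far.
grown <- data.frame()
for (txt in texts) {
  out <- score_text(txt)
  grown <- rbind(grown, data.frame(em1 = out$trust, em2 = out$anger))
}

# Pattern 2: collect the pieces in a list, then bind them once.
pieces <- lapply(texts, function(txt) {
  out <- score_text(txt)
  data.frame(em1 = out$trust, em2 = out$anger)
})
bound <- do.call(rbind, pieces)
```

With only a few rows the difference is negligible; the single bind is likely to matter more as the number of texts grows.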

If any liwcalike call might fail, wrap the processing in tryCatch so that an error yields a data frame row of NAs instead of stopping the whole run:

df_list <- lapply(df$text, function(txt) {
               tryCatch({
                   out <- liwcalike(txt, dictionary = data_dictionary_NRC)
                   return(data.frame(em1=out$trust, em2=out$anger, origin=txt, error=NA_character_))
               }, error = function(e) 
                   # keep only the message text; a condition object cannot be a data frame column
                   data.frame(em1=NA, em2=NA, origin=txt, error=conditionMessage(e))
               )
           })

final_df <- do.call(rbind, df_list)
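
The error-handling pattern can also be checked in isolation. The sketch below assumes a hypothetical `score_text` function that fails on empty input, standing in for a `liwcalike` call that errors. It stores `conditionMessage(e)` in the `error` column, since a condition object itself cannot be stored directly as a data frame column.

```r
# Hypothetical scorer that fails on empty input, standing in for liwcalike().
score_text <- function(txt) {
  if (!nzchar(txt)) stop("empty text")
  data.frame(trust = nchar(txt), anger = 0)
}

texts <- c("some text", "", "more text")

rows <- lapply(texts, function(txt) {
  tryCatch({
    out <- score_text(txt)
    data.frame(em1 = out$trust, em2 = out$anger, origin = txt,
               error = NA_character_)
  }, error = function(e) {
    # On failure, return an NA row that records the error message.
    data.frame(em1 = NA, em2 = NA, origin = txt,
               error = conditionMessage(e))
  })
})

# One row per input text; the failed text keeps its place with NA scores.
result <- do.call(rbind, rows)
```

Filtering on `!is.na(result$error)` afterwards shows which inputs failed and why.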

Upvotes: 1
