Reputation: 527
Is it possible to replace a for loop like this:
library(quanteda)
library(quanteda.dictionaries)
#dummy data
df <- data.frame(text = c("Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.", "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.", "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book."))
dfm <- data.frame()  # initialize the accumulator before the loop
for (j in seq_len(nrow(df))) {
  out <- liwcalike(df$text[j],
                   dictionary = data_dictionary_NRC)
  dfm <- rbind(dfm, data.frame(em1 = out$trust, em2 = out$anger))
}
with lapply or anything else to reduce execution time?
Upvotes: 1
Views: 173
Reputation: 107567
Build a list of data frames and call rbind once outside the loop. This avoids the quadratic copying caused by calling rbind inside the loop:
df_list <- lapply(df$text, function(txt) {
  out <- liwcalike(txt, dictionary = data_dictionary_NRC)
  return(data.frame(em1 = out$trust, em2 = out$anger, origin = txt))
})
final_df <- do.call(rbind, df_list)
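To make the difference between the two patterns concrete without depending on quanteda, here is a minimal base R sketch. The function score_text is a hypothetical stand-in for liwcalike (it is not part of any package) and simply returns a one-row data frame per text.

```r
# Hypothetical stand-in for liwcalike(): one-row data frame per input text.
score_text <- function(txt) {
  data.frame(
    n_chars = nchar(txt),
    n_words = lengths(strsplit(txt, " ", fixed = TRUE))
  )
}

texts <- c("one two three", "four five", "six")

# Pattern 1: grow the result inside the loop.
# Each rbind() copies everything accumulated so far.
grown <- data.frame()
for (txt in texts) {
  grown <- rbind(grown, score_text(txt))
}

# Pattern 2: collect the pieces in a list and bind them once at the end.
rows <- lapply(texts, score_text)
bound <- do.call(rbind, rows)
```

Both patterns should produce the same rows. The second avoids re-copying the accumulated data frame on every iteration, which is where the savings come from as the number of rows grows.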
In case of any issues with the liwcalike call, wrap the process in tryCatch to return an NA-row data frame on any error:
df_list <- lapply(df$text, function(txt) {
  tryCatch({
    out <- liwcalike(txt, dictionary = data_dictionary_NRC)
    return(data.frame(em1 = out$trust, em2 = out$anger, origin = txt, error = NA_character_))
  }, error = function(e) {
    # store the error message text rather than the condition object itself
    data.frame(em1 = NA, em2 = NA, origin = txt, error = conditionMessage(e))
  })
})
final_df <- do.call(rbind, df_list)
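The error-handling pattern can be tried in isolation with a small base R sketch. Here score_text is again a hypothetical stand-in for liwcalike, written to fail on empty input so that the error branch is actually exercised.

```r
# Hypothetical stand-in for liwcalike() that fails on empty input.
score_text <- function(txt) {
  if (!nzchar(txt)) stop("empty text")
  data.frame(n_chars = nchar(txt))
}

texts <- c("some text", "", "more")

rows <- lapply(texts, function(txt) {
  tryCatch({
    out <- score_text(txt)
    data.frame(n_chars = out$n_chars, origin = txt,
               error = NA_character_, stringsAsFactors = FALSE)
  }, error = function(e) {
    # On failure, keep the row but fill the result with NA and record the message.
    data.frame(n_chars = NA_integer_, origin = txt,
               error = conditionMessage(e), stringsAsFactors = FALSE)
  })
})

result <- do.call(rbind, rows)
```

Keeping the same column names and types in both branches is what lets the final rbind succeed. The error column then shows which inputs failed and why.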
Upvotes: 1