Reputation: 940
So I am trying to remove the stopwords from a vector of 318591 strings.
To do this I am using:
# %nin% comes from the Hmisc package
X <- lapply(articles_and_id[,2], function(x) {
  t <- unlist(strsplit(x, " "))
  t[t %nin% stopWords]
})
This splits my strings, which end up in a list looking like this:
> X[1]
[[1]]
[1] "new" "relictual" "highly" "troglomorphic" "species" "tomoceridae" "collembola"
[8] "deep" "croatian" "cave"
So I want to put it back into a dataframe transforming it into the following form:
1 new, relictual, highly, troglomorphic, species, tomoceridae, collembola, deep, croatian, cave
for which I am using:
articles_and_id[,2] <- lapply(X,toString)
But it just never finishes!
Any suggestions on how to improve this? If I stop the run
Upvotes: 1
Views: 32
Reputation: 19544
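You can use sapply with paste, which returns a character vector instead of a list (use collapse=", " if you want the comma-separated form shown in the question):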
articles_and_id[,2] <- sapply(X,paste, collapse=" ")
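For reference, here is a minimal self-contained sketch of the whole round trip on toy data. The data frame contents and the stopWords vector below are made up for illustration, base R's %in% stands in for %nin%, and vapply is used as a stricter variant of sapply. It is only a sketch, not a benchmark on the real 318591 rows.

```r
# Toy stand-ins for the real articles_and_id and stopWords (hypothetical data)
articles_and_id <- data.frame(
  id = 1:2,
  text = c("a new relictual species of the cave", "the deep croatian cave"),
  stringsAsFactors = FALSE
)
stopWords <- c("a", "of", "the")

# Split each string on single spaces and drop the stopwords;
# !(x %in% y) is the base R equivalent of Hmisc's x %nin% y
X <- lapply(articles_and_id[, 2], function(x) {
  tokens <- unlist(strsplit(x, " ", fixed = TRUE))
  tokens[!(tokens %in% stopWords)]
})

# Collapse each token vector back into one string; vapply checks that
# every result is a length-1 character value, so the column stays atomic
articles_and_id[, 2] <- vapply(X, paste, character(1), collapse = ", ")
print(articles_and_id[, 2])
```

The point of sapply/vapply over lapply here is that the right-hand side is a plain character vector, so the column is assigned as an ordinary character column rather than a list.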
Upvotes: 1