Reputation: 343
I'm not sure that I understand the different outputs in these two scenarios:
(1)
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split <- strsplit(pioneers, split = ":")
split
(2)
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split <- lapply(pioneers, strsplit, split = ":")
split
In both cases the output is a list, but I'm not sure when I'd use one notation (simply applying the function to the vector) rather than the other (using lapply to loop the function over the vector).
Thanks for the help.
Greg
Upvotes: 0
Views: 156
Reputation: 3284
To me, it comes down to how the output is returned. The l in lapply stands for list, i.e. the output is returned as a list. strsplit already returns a list because, if there were multiple : characters in an element of your pioneers vector, a list is the only data structure that makes sense: one list element for each of the 4 elements of the vector, each containing a character vector of the pieces of the split string.
So using lapply(x, strsplit, ...) will always return a list inside a list, which you probably don't want in this case.
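To see the extra level of nesting concretely, here is a small sketch that reuses the pioneers vector from the question. The names direct and nested are just illustrative labels, not anything from the original post.

```r
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")

direct <- strsplit(pioneers, split = ":")          # one vectorised call
nested <- lapply(pioneers, strsplit, split = ":")  # one strsplit call per element

direct[[1]]       # a character vector: "GAUSS" "1777"
nested[[1]]       # a list of length 1 wrapping that vector
nested[[1]][[1]]  # the character vector again, one level deeper

# str() prints the structure compactly, which makes the nesting easy to spot
str(direct)
str(nested)
```

With the nested version you need two [[ ]] steps to reach each pair of strings, whereas the direct call needs only one.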
Using lapply is useful in cases where you expect the result of the function you're applying to be a vector of undefined or variable length. As strsplit already handles this itself (it is vectorised over its input and returns a list), the use of lapply is redundant. In general, you should know what form you want your answer to be in, and use the appropriate functions to coerce the output into the right data structure.
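As one possible illustration of coercing the output into a chosen structure, the sketch below assumes you want a character vector of names and an integer vector of years. The names parts, pioneer_names and pioneer_years are hypothetical, and vapply is only one of several ways to do this.

```r
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")

# strsplit is vectorised, so a single call gives a list of character vectors
parts <- strsplit(pioneers, split = ":", fixed = TRUE)

# vapply declares the expected type and length of each result up front,
# so the answer is a plain vector rather than a list
pioneer_names <- vapply(parts, function(p) p[1], character(1))
pioneer_years <- as.integer(vapply(parts, function(p) p[2], character(1)))

pioneer_names
pioneer_years
```

Here vapply fits because every element is known to yield exactly one string; if the number of pieces per element could vary, keeping the list from strsplit would be the safer choice.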
To be clear, the outputs of the two examples you gave are not the same: (1) is a list of character vectors, while (2) is a list of lists. To get a result identical to (1) using lapply, you would write
lapply(pioneers, function(x, split) strsplit(x, split)[[1]], split = ":")
i.e. taking the first element of the inner list (which has only one element anyway) in each case.
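A quick way to check this equivalence is to compare the results with identical(). This is a minimal sketch; direct, unwrapped and flattened are illustrative names only.

```r
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")

direct <- strsplit(pioneers, split = ":")

# lapply version that unwraps the length-1 inner list with [[1]]
unwrapped <- lapply(pioneers, function(x, split) strsplit(x, split)[[1]], split = ":")

# Alternative: build the nested result first, then remove one level of nesting
nested <- lapply(pioneers, strsplit, split = ":")
flattened <- unlist(nested, recursive = FALSE)

identical(direct, unwrapped)  # should be TRUE
identical(direct, flattened)  # should be TRUE
```

Either route gets back to the structure of scenario (1), but the plain strsplit call is shorter and avoids the detour.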
Upvotes: 2