Reputation: 889
I have a function that was suggested by a user as an answer to my previous question:
word_string <- function(x) {
inds <- seq_len(nchar(x))
start = inds[-length(inds)]
stop = inds[-1]
substring(x, start, stop)
}
The function works as expected and breaks down a given word into component parts as per my specifications:
word_string('microwave')
[1] "mi" "ic" "cr" "ro" "ow" "wa" "av" "ve"
What I now want to be able to do is have the function applied to all rows of a specified column in a dataframe.
Here's a dataframe for purposes of illustration:
word <- c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink")
some_value <- c("2","100","16","999", "65","1000000")
my_df <- data.frame(word, some_value, stringsAsFactors = FALSE )
my_df
word some_value
1 House 2
2 Motorcar 100
3 Boat 16
4 Dog 999
5 Tree 65
6 Drink 1000000
Now, if I use lapply to apply the function to my dataframe, not only do I get incorrect results but also a warning message.
lapply(my_df['word'], word_string)
$word
[1] "Ho" "ot" "at" "" "Tr" "ri"
Warning message:
In seq_len(nchar(x)) : first element used of 'length.out' argument
So you can see that the function is being applied, but it's being applied such that it's evaluating each row partially. The desired output would be something like:
[1] "ho" "ou" "us" "se"
[2] "mo" "ot" "to" "or" "rc" "ca" "ar"
[3] "bo" "oa" "at"
[4] "do" "og"
[5] "tr" "re" "ee"
[6] "dr" "ri" "in" "nk"
Any guidance greatly appreciated.
Upvotes: 1
Views: 64
Reputation: 887901
The reason is that [ still returns a data.frame with one column (if we don't use ,), and so here the unit is a single column.
str(my_df['word'])
# 'data.frame': 6 obs. of 1 variable:
# $ word: chr "House" "Motorcar" "Boat" "Dog" ...
The lapply loops over that single column instead of each of the elements in that column.
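To make that concrete, here is a small sketch (not part of the original answer) that counts what lapply passes in each case, and then reproduces the partial output from the question by hand. The names first_pos and last_pos are only illustrative; they stand for the start and stop vectors that word_string would build from the first word's length (5) when it receives the whole column at once.

```r
word <- c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink")
my_df <- data.frame(word, stringsAsFactors = FALSE)

# A data.frame is a list of columns, so lapply() calls the function once
# and hands it the entire character vector of 6 words
per_column <- lapply(my_df['word'], length)
per_column$word
# [1] 6

# With [[ the column is a plain character vector, so lapply() calls the
# function once per word
per_word <- lapply(my_df[['word']], length)
length(per_word)
# [1] 6

# Positions derived from the first word only ("House", 5 characters);
# substring() recycles them across all 6 words
first_pos <- 1:4
last_pos <- 2:5
partial <- substring(my_df[['word']], first_pos, last_pos)
partial
# [1] "Ho" "ot" "at" ""   "Tr" "ri"
```

The last result matches the output shown in the question: each word gets only one start/stop pair, taken from whichever position the recycling lands on.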
We need either $ or [[ to extract the column as a vector:
lapply(my_df[['word']], word_string)
#[[1]]
#[1] "Ho" "ou" "us" "se"
#[[2]]
#[1] "Mo" "ot" "to" "or" "rc" "ca" "ar"
#[[3]]
#[1] "Bo" "oa" "at"
#[[4]]
#[1] "Do" "og"
#[[5]]
#[1] "Tr" "re" "ee"
#[[6]]
#[1] "Dr" "ri" "in" "nk"
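If it helps to keep the results attached to the words, one possible follow-up (a sketch that goes beyond what the question asked) is to name the list elements with setNames and, optionally, store the list as a list column. The names pairs and my_df$pairs are made up for this illustration.

```r
word <- c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink")
some_value <- c("2", "100", "16", "999", "65", "1000000")
my_df <- data.frame(word, some_value, stringsAsFactors = FALSE)

# Same function as in the question
word_string <- function(x) {
  inds <- seq_len(nchar(x))
  start <- inds[-length(inds)]
  stop <- inds[-1]
  substring(x, start, stop)
}

# Apply to each word and label every result with the word it came from
pairs <- setNames(lapply(my_df[['word']], word_string), my_df[['word']])
pairs[["House"]]
# [1] "Ho" "ou" "us" "se"

# Optionally keep the results next to the original rows as a list column
my_df$pairs <- pairs
my_df$pairs[[4]]
# [1] "Do" "og"
```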
Upvotes: 2