Reputation: 103
Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?
For example,
x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
# a | 1
# b | 2
# c | 3
f = function(letter, number) { paste(letter, number, sep='') }
# desired output is
# a1
# b2
# c3
How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.
Upvotes: 10
Views: 11501
Reputation: 60756
As @greg points out, paste() can do this on its own. I suspect your example is a simplification of a more general problem. After struggling with this in the past, as illustrated in this previous question, I ended up using the plyr package for this type of thing. plyr does a LOT more, but for this kind of task it's easy:
> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
X1 V1
1 1 a1
2 2 b2
3 3 c3
You'll probably want to rename the output columns.
So while I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array. adply does not use the "group by" variable row that @joshua created. How he did it is exactly how I was doing it until Hadley tipped me off to the adply() approach in the aforementioned question.
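For comparison, here is a minimal base R sketch of the same row-wise call without plyr. It is not from the answers in this thread. It assumes the column names of x match the argument names of f, and the names res and out are only illustrative.

```r
# Rebuild the example data from the question
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# c(list(...), x) turns the data frame into a list of named columns, so
# do.call() hands each column to mapply() as a named argument of f.
# mapply() then calls f once per row.
res <- do.call(mapply, c(list(FUN = f, USE.NAMES = FALSE), x))

# Wrap the result in a data frame (the column name is an arbitrary choice)
out <- data.frame(result = res, stringsAsFactors = FALSE)
print(out)
```

Because the columns are passed by name, this should carry over to a function of n columns as long as every column corresponds to an argument of f.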
Upvotes: 11
Reputation: 176718
I think you were thinking of something like this, but note that the apply family of functions do not return data.frames. They will also attempt to coerce your data.frame to a matrix before applying the function.
apply(x,1,function(x) paste(x,collapse=""))
So you may be more interested in ddply from the plyr package.
> x$row <- 1:NROW(x)
> ddply(x, "row", function(df) paste(df[[1]],df[[2]],sep=""))
row V1
1 1 a1
2 2 b2
3 3 c3
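To make the coercion point about apply concrete, here is a small sketch using the data frame from the question. The variable names m and cls are only illustrative.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3))

# apply() works on as.matrix(x); with a mix of text and numeric columns
# the whole matrix becomes character, so the numbers turn into strings.
m <- as.matrix(x)
print(typeof(m))

# Inside apply(), each row arrives as a character vector, so a function
# that expects a numeric "number" value would receive a string instead.
cls <- apply(x, 1, function(r) class(r[["number"]]))
print(cls)
```

This does no harm with paste(), but a function that does arithmetic on number would fail or need an explicit as.numeric().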
Upvotes: 1