Reputation: 75
I am trying to apply a function to each row of a DataFrame, as the code below shows:
using RDatasets
iris = dataset("datasets", "iris")
function mean_n_var(x)
    mean1 = mean([x[1], x[2], x[3], x[4]])
    var1 = var([x[1], x[2], x[3], x[4]])
    rst = [mean1, var1]
    return rst
end
mean_n_var([2,4,5,6])
for row in eachrow(iris[1:4])
    println(mean_n_var(convert(Array, row)))
end
However, instead of printing results, I'd like to save them in an array or another DataFrame.
Thanks in advance.
Upvotes: 4
Views: 761
Reputation: 69899
I think it is worth mentioning a few more options beyond those already covered.
I assume you want a Matrix or a DataFrame. There are several possible approaches.
The first, and most direct, produces a Matrix:
mean_n_var(a) = [mean(a), var(a)]
hcat((mean_n_var(Array(x)) for x in eachrow(iris[1:4]))...) # 2×150 matrix: mean and var in rows
vcat((mean_n_var(Array(x)).' for x in eachrow(iris[1:4]))...) # 150×2 matrix: mean and var in columns
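The snippet above uses the syntax of the Julia and DataFrames.jl versions current when this was written. On Julia 1.x with a recent DataFrames.jl, a roughly equivalent sketch might look like the following. The small `df` is a hypothetical stand-in for the four numeric iris columns, and `results`, `by_rows` and `by_cols` are names introduced only for illustration.

```julia
using DataFrames, Statistics

# Hypothetical stand-in for the four numeric iris columns
df = DataFrame(a = [1.0, 2.0], b = [2.0, 4.0], c = [3.0, 6.0], d = [4.0, 8.0])

mean_n_var(a) = [mean(a), var(a)]

# One [mean, var] vector per row, collected into a vector of vectors
results = [mean_n_var(Vector(row)) for row in eachrow(df)]

# Stack the per-row results side by side: 2×nrow matrix, mean and var in rows
by_rows = reduce(hcat, results)

# Swap the dimensions: nrow×2 matrix, mean and var in columns
by_cols = permutedims(by_rows)
```

Using `reduce(hcat, ...)` instead of splatting avoids passing one argument per row to `hcat`, which may matter for larger tables.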
Another possible approach is vectorized, e.g.:
mat_iris = Matrix(iris[1:4])
mat = hcat(mean(mat_iris, 2), var(mat_iris, 2))
df = DataFrame([vec(f(mat_iris, 2)) for f in [mean,var]], [:mean, :var])
DataFrame(mat) # this constructor also accepts variable names on master but is not released yet
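For readers on Julia 1.x, where the reduction dimension is passed as the `dims` keyword, the vectorized approach could be sketched roughly as below. This assumes a recent DataFrames.jl release; the small `df` is again a hypothetical stand-in for `iris[1:4]`, and `stats` and `out` are illustrative names.

```julia
using DataFrames, Statistics

# Hypothetical stand-in for the four numeric iris columns
df = DataFrame(a = [1.0, 2.0], b = [2.0, 4.0], c = [3.0, 6.0], d = [4.0, 8.0])

mat_df = Matrix(df)

# Row-wise mean and variance (dims=2 reduces across columns), as an nrow×2 matrix
stats = hcat(mean(mat_df, dims=2), var(mat_df, dims=2))

# Wrap the matrix in a DataFrame, naming the two columns
out = DataFrame(stats, [:mean, :var])
```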
Upvotes: 3