I have a data.frame:
df_1 <- data.frame(
  x = replicate(
    n = 3, expr = rnorm(
      n = 30, mean = 100, sd = 10
    )
  ),
  y = sample(x = 1:3, size = 30, replace = TRUE)
)
And the following list:
lt_1 <- split(
  x = df_1,
  f = df_1[['y']]
)
names(lt_1) <- paste('df', seq_along(lt_1), sep = '_')
And the following loop:
library(magrittr)
for (i in lt_1[c(1)]) {
  print(
    x = cbind(i, var_1 = rowSums(i[, 1:2]),
              var_2 = rowMeans(i[, 1:3]),
              var_3 = multiply_by(i[1], i[2]))
  )
}
The result is:
x.1 x.2 x.3 y var_1 var_2 x.1
7 104.87429 96.05710 81.95041 1 200.9314 94.29393 10073.920
9 105.75781 111.00025 101.53253 1 216.7581 106.09687 11739.144
12 103.89843 97.46638 92.90054 1 201.3648 98.08845 10126.604
14 77.85300 105.56663 90.65902 1 183.4196 91.35955 8218.679
16 99.55066 92.92505 102.91446 1 192.4757 98.46339 9250.750
21 109.18977 103.98106 94.31331 1 213.1708 102.49472 11353.668
29 95.21850 105.69720 103.70019 1 200.9157 101.53863 10064.328
Why is the last column named x.1 instead of var_3 in the output?
Reputation: 389065
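That is because you are multiplying two data frames, not vectors: i[1] and i[2] are one-column data frames, so their product is also a data frame, and it keeps its own column name when passed to cbind(). Here is a short example to illustrate: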
head(cbind(mtcars, new_col = mtcars[1] * mtcars[2]))
One would expect a new column named new_col in the data, but you get:
# mpg cyl disp hp drat wt qsec vs am gear carb mpg
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 126.0
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 126.0
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 91.2
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 128.4
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 149.6
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 108.6
The new column is named mpg because that is the column name of mtcars[1], the left-hand operand of the multiplication. (Try head(cbind(mtcars, new_col = mtcars[2] * mtcars[1])): with the operands swapped, the new column is named cyl.)
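As a minimal sketch of the naming behaviour described above, using only base R and the built-in mtcars data (the object names single_bracket, double_bracket, and prod_df are made up for this illustration):

```r
# mtcars ships with base R (datasets package), so no extra packages are needed.
single_bracket <- mtcars[1]    # `[` keeps the data.frame structure (one column)
double_bracket <- mtcars[[1]]  # `[[` extracts the column as a plain vector

class(single_bracket)  # "data.frame"
class(double_bracket)  # "numeric"

# Multiplying two data frames returns a data frame that keeps
# the column names of the left-hand operand.
prod_df <- mtcars[1] * mtcars[2]
names(prod_df)  # "mpg"

# cbind() applies the argument name to a vector, but a one-column
# data frame keeps its own column name.
names(cbind(mtcars[1:2], new_col = prod_df))       # "mpg" "cyl" "mpg"
names(cbind(mtcars[1:2], new_col = prod_df[[1]]))  # "mpg" "cyl" "new_col"
```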
To avoid that, extract the columns as vectors with [[ ]]:
head(cbind(mtcars, new_col = mtcars[[1]] * mtcars[[2]]))
# mpg cyl disp hp drat wt qsec vs am gear carb new_col
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 126.0
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 126.0
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 91.2
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 128.4
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 149.6
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 108.6
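For reference, [[ ]] is not the only way to get a column as a vector. The sketch below, again on mtcars and with made-up object names, checks that three common base R forms give the same result, so any of them should work inside cbind():

```r
# Three ways to extract the first column of mtcars as a numeric vector.
by_position <- mtcars[[1]]   # double bracket with a position
by_name     <- mtcars$mpg    # dollar sign with the column name
by_matrix   <- mtcars[, 1]   # matrix-style indexing drops to a vector for a single column

identical(by_position, by_name)    # TRUE
identical(by_position, by_matrix)  # TRUE

# Because the product is a vector, the argument name is used for the new column.
res <- cbind(mtcars, new_col = mtcars$mpg * mtcars$cyl)
tail(names(res), 1)  # "new_col"
```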
Hence, in your for loop use:
for (i in lt_1[c(1)]) {
  print(
    x = cbind(i, var_1 = rowSums(i[, 1:2]),
              var_2 = rowMeans(i[, 1:3]),
              var_3 = multiply_by(i[[1]], i[[2]]))
  )
}
# x.1 x.2 x.3 y var_1 var_2 var_3
#1 89.510 93.741 113.766 1 183.25 99.006 8390.8
#2 94.791 90.991 98.196 1 185.78 94.660 8625.2
#3 116.232 106.637 84.323 1 222.87 102.398 12394.7
#4 89.299 103.003 97.393 1 192.30 96.565 9198.1
#10 86.656 101.626 118.714 1 188.28 102.332 8806.6
#13 106.344 103.055 93.797 1 209.40 101.065 10959.2
#15 107.936 104.104 97.580 1 212.04 103.207 11236.5
#16 98.476 101.837 111.175 1 200.31 103.829 10028.5
#19 92.650 111.762 101.930 1 204.41 102.114 10354.7
#21 106.193 90.544 100.071 1 196.74 98.936 9615.1
#23 93.143 104.520 90.227 1 197.66 95.963 9735.3
#28 96.806 104.856 92.445 1 201.66 98.036 10150.7
#29 100.845 97.343 97.360 1 198.19 98.516 9816.6
#30 92.315 101.516 92.475 1 193.83 95.436 9371.5
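If the goal is to keep the new columns for every element of the list rather than print only the first, one possible variation is sketched below. It is not part of the original question: the seed, the result name lt_2, and the use of lapply() with plain * instead of multiply_by() are assumptions made for illustration. The data are random, so the values will differ from the output shown above.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible

# Same construction as in the question.
df_1 <- data.frame(
  x = replicate(n = 3, expr = rnorm(n = 30, mean = 100, sd = 10)),
  y = sample(x = 1:3, size = 30, replace = TRUE)
)
lt_1 <- split(x = df_1, f = df_1[["y"]])
names(lt_1) <- paste("df", seq_along(lt_1), sep = "_")

# Add the three columns to every data frame in the list and keep the results.
lt_2 <- lapply(lt_1, function(d) {
  cbind(
    d,
    var_1 = rowSums(d[, 1:2]),
    var_2 = rowMeans(d[, 1:3]),
    var_3 = d[[1]] * d[[2]]  # vectors, so the name var_3 is kept
  )
})

names(lt_2[[1]])  # "x.1" "x.2" "x.3" "y" "var_1" "var_2" "var_3"
```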