Reputation: 1148
I have a small problem with the apply function in R.
I have a data frame - "bakery":
head(bakery)
Day.of.Week White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
1 5 436 456 417 311 95 96 224
2 6 653 571 557 416 129 140 224
3 1 496 490 403 351 114 108 228
4 2 786 611 570 473 165 148 304
5 4 547 474 424 365 144 104 256
6 5 513 443 380 317 100 92 180
The first column is the coded day of the week; the other columns show how much of each type of bread was sold on that day. My task is to create a new variable that holds, for each day of the week, the mean amount sold, summed over all types of bread. I did it using this command:
x12 <- 0
# columns 2 to 8 are the bread types; column 1 is the day of the week
for (i in 2:8) {
  x12 <- x12 + tapply(bakery[, i], bakery[, 1], mean)
}
x12
# 1 2 4 5 6
# 2190 3057 2314 2030 2690
Can I do the same using the apply or sapply function?
Upvotes: 1
Views: 110
Reputation: 3488
Using dplyr
library(dplyr)
bakery %>%
  group_by(Day.of.Week) %>%
  summarise_each(funs(mean))
Day.of.Week White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
1 1 496.0 490.0 403.0 351 114.0 108 228
2 2 786.0 611.0 570.0 473 165.0 148 304
3 4 547.0 474.0 424.0 365 144.0 104 256
4 5 474.5 449.5 398.5 314 97.5 94 202
5 6 653.0 571.0 557.0 416 129.0 140 224
Or, if you're looking for the total amount of bread sold per day:
bakery %>%
mutate(SumVar=rowSums(.[-1])) %>%
group_by(Day.of.Week) %>%
select(Day.of.Week,SumVar) %>%
summarise_each(funs(mean))
Day.of.Week SumVar
1 1 2190
2 2 3057
3 4 2314
4 5 2030
5 6 2690
FIXED so that rowSums doesn't add in the day to the sum.
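In recent dplyr releases, summarise_each() and funs() are deprecated. The following is a minimal sketch of the same two summaries written with across(), assuming dplyr 1.0.0 or later is installed; the bakery data frame is rebuilt here from the sample rows in the question so that the snippet runs on its own.

```r
library(dplyr)

# Sample data taken from the rows shown in the question.
bakery <- data.frame(
  Day.of.Week       = c(5L, 6L, 1L, 2L, 4L, 5L),
  White             = c(436L, 653L, 496L, 786L, 547L, 513L),
  Wheat             = c(456L, 571L, 490L, 611L, 474L, 443L),
  Multigrain        = c(417L, 557L, 403L, 570L, 424L, 380L),
  Black             = c(311L, 416L, 351L, 473L, 365L, 317L),
  Cinnamon.Raisin   = c(95L, 129L, 114L, 165L, 144L, 100L),
  Sour.Dough.French = c(96L, 140L, 108L, 148L, 104L, 92L),
  Light.Oat         = c(224L, 224L, 228L, 304L, 256L, 180L)
)

# Mean of every bread column per day of the week.
# Inside a grouped summarise(), across() skips the grouping column.
per_bread <- bakery %>%
  group_by(Day.of.Week) %>%
  summarise(across(everything(), mean))

# Mean of the row totals per day of the week.
# across(-Day.of.Week) selects the bread columns only, so the day code
# is not added into the sum.
per_day <- bakery %>%
  mutate(SumVar = rowSums(across(-Day.of.Week))) %>%
  group_by(Day.of.Week) %>%
  summarise(SumVar = mean(SumVar))

per_bread
per_day
```

The per_day result should hold the same values as the SumVar table above.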
Upvotes: 0
Reputation: 13314
A solution based on data.table:
library(data.table)
setDT(bakery)[,.(mean=mean(rowSums(.SD))),by=Day.of.Week]
# Day.of.Week mean
# 1: 5 2030
# 2: 6 2690
# 3: 1 2190
# 4: 2 3057
# 5: 4 2314
Upvotes: 0
Reputation: 13122
Also:
rowsum(bakery[-1], bakery[[1]]) / table(bakery[[1]])
# White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
#1 496.0 490.0 403.0 351 114.0 108 228
#2 786.0 611.0 570.0 473 165.0 148 304
#4 547.0 474.0 424.0 365 144.0 104 256
#5 474.5 449.5 398.5 314 97.5 94 202
#6 653.0 571.0 557.0 416 129.0 140 224
rowSums(rowsum(bakery[-1], bakery[[1]]) / table(bakery[[1]]))
# 1 2 4 5 6
#2190 3057 2314 2030 2690
Where:
bakery = structure(list(Day.of.Week = c(5L, 6L, 1L, 2L, 4L, 5L), White = c(436L,
653L, 496L, 786L, 547L, 513L), Wheat = c(456L, 571L, 490L, 611L,
474L, 443L), Multigrain = c(417L, 557L, 403L, 570L, 424L, 380L
), Black = c(311L, 416L, 351L, 473L, 365L, 317L), Cinnamon.Raisin = c(95L,
129L, 114L, 165L, 144L, 100L), Sour.Dough.French = c(96L, 140L,
108L, 148L, 104L, 92L), Light.Oat = c(224L, 224L, 228L, 304L,
256L, 180L)), .Names = c("Day.of.Week", "White", "Wheat", "Multigrain",
"Black", "Cinnamon.Raisin", "Sour.Dough.French", "Light.Oat"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
Upvotes: 1
Reputation: 206566
Because you want to group by day of the week, tapply would be a good choice here. You can do
tapply(rowSums(bakery[,-1]), factor(bakery[,1]), mean)
because, in this case, the mean of the sums should be the same as the sum of the means (every bread column has the same rows for each day). It's not easy to test, though, because your sample result does not seem to match your test data (there are rows with Day.of.Week 7).
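As a quick check of that claim, and as one way to use sapply as asked in the question, here is a small sketch in base R. It assumes the six sample rows posted in this thread and no missing values; with NAs in some columns, the two quantities could differ.

```r
# Sample data taken from the rows posted in this thread.
bakery <- data.frame(
  Day.of.Week       = c(5L, 6L, 1L, 2L, 4L, 5L),
  White             = c(436L, 653L, 496L, 786L, 547L, 513L),
  Wheat             = c(456L, 571L, 490L, 611L, 474L, 443L),
  Multigrain        = c(417L, 557L, 403L, 570L, 424L, 380L),
  Black             = c(311L, 416L, 351L, 473L, 365L, 317L),
  Cinnamon.Raisin   = c(95L, 129L, 114L, 165L, 144L, 100L),
  Sour.Dough.French = c(96L, 140L, 108L, 148L, 104L, 92L),
  Light.Oat         = c(224L, 224L, 228L, 304L, 256L, 180L)
)

day <- factor(bakery$Day.of.Week)

# Mean of the row sums per day (the tapply approach above).
mean_of_sums <- tapply(rowSums(bakery[, -1]), day, mean)

# Sum of the per-column means per day (what the loop in the question does),
# written with sapply: one column of per-day means for each bread type.
per_bread_means <- sapply(bakery[-1], function(col) tapply(col, day, mean))
sum_of_means <- rowSums(per_bread_means)

# Compare the two results, ignoring names and attributes.
all.equal(as.vector(mean_of_sums), as.vector(sum_of_means))
```

The intermediate per_bread_means matrix has one row per day and one column per bread type, so it also gives the per-bread means if those are needed.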
Upvotes: 2