Slavka

Reputation: 1148

Problems with using apply function

I have a small problem with the apply function in R. I have a data frame, "bakery":

head(bakery)
  Day.of.Week White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
1           5   436   456        417   311              95                96       224
2           6   653   571        557   416             129               140       224
3           1   496   490        403   351             114               108       228
4           2   786   611        570   473             165               148       304
5           4   547   474        424   365             144               104       256
6           5   513   443        380   317             100                92       180

The first column is the coded day of the week; the other columns show the amount of each type of bread sold on that day. My task is to create a new variable that holds, for each day of the week, the sum over all bread types of the mean amount sold. I did it using this loop:

x12 <- 0
for (i in 2:8) {
  x12 <- x12 + tapply(bakery[, i], bakery[, 1], mean)
}
x12
#    1    2    4    5    6 
# 2190 3057 2314 2030 2690 

Can I do the same using the apply or sapply function?

Upvotes: 1

Views: 110

Answers (4)

Andrew Taylor

Reputation: 3488

Using dplyr

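library(dplyr)

# The original used summarise_each(funs(mean)), which is deprecated in
# current dplyr; across() is the equivalent in dplyr >= 1.0.
bakery %>%
  group_by(Day.of.Week) %>%
  summarise(across(everything(), mean))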

  Day.of.Week White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
1           1 496.0 490.0      403.0   351           114.0               108       228
2           2 786.0 611.0      570.0   473           165.0               148       304
3           4 547.0 474.0      424.0   365           144.0               104       256
4           5 474.5 449.5      398.5   314            97.5                94       202
5           6 653.0 571.0      557.0   416           129.0               140       224

Or, if you're looking for the mean total amount of bread sold per day of the week:

bakery %>%
  mutate(SumVar = rowSums(.[-1])) %>%   # sum the bread columns only, not Day.of.Week
  group_by(Day.of.Week) %>%
  summarise(SumVar = mean(SumVar))      # was summarise_each(funs(mean)), now deprecated

  Day.of.Week SumVar
1           1   2190
2           2   3057
3           4   2314
4           5   2030
5           6   2690

Edit: fixed so that rowSums doesn't include the day column in the sum.
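To illustrate why the day column has to be dropped, here is a small sketch using only the first two rows of the question's data. The name bakery2 is just a hypothetical stand-in for this example and is not part of the original answer.

```r
# First two rows of the question's data; bakery2 is a hypothetical name.
bakery2 <- data.frame(
  Day.of.Week       = c(5L, 6L),
  White             = c(436L, 653L),
  Wheat             = c(456L, 571L),
  Multigrain        = c(417L, 557L),
  Black             = c(311L, 416L),
  Cinnamon.Raisin   = c(95L, 129L),
  Sour.Dough.French = c(96L, 140L),
  Light.Oat         = c(224L, 224L)
)

with_day    <- rowSums(bakery2)      # wrongly adds the day code to each total
without_day <- rowSums(bakery2[-1])  # bread columns only

# The two totals differ by exactly the Day.of.Week code of each row.
with_day - without_day
```

The negative index [-1] drops the first column, so it relies on Day.of.Week staying in the first position.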

Upvotes: 0

Marat Talipov

Reputation: 13314

Solution based on data.table:

library(data.table)

# setDT() converts bakery to a data.table by reference (no copy is made)
setDT(bakery)[, .(mean = mean(rowSums(.SD))), by = Day.of.Week]

#    Day.of.Week mean
# 1:           5 2030
# 2:           6 2690
# 3:           1 2190
# 4:           2 3057
# 5:           4 2314

Upvotes: 0

alexis_laz

Reputation: 13122

Also:

rowsum(bakery[-1], bakery[[1]]) / table(bakery[[1]])
#  White Wheat Multigrain Black Cinnamon.Raisin Sour.Dough.French Light.Oat
#1 496.0 490.0      403.0   351           114.0               108       228
#2 786.0 611.0      570.0   473           165.0               148       304
#4 547.0 474.0      424.0   365           144.0               104       256
#5 474.5 449.5      398.5   314            97.5                94       202
#6 653.0 571.0      557.0   416           129.0               140       224

rowSums(rowsum(bakery[-1], bakery[[1]]) / table(bakery[[1]]))
#   1    2    4    5    6 
#2190 3057 2314 2030 2690

Where:

bakery = structure(list(Day.of.Week = c(5L, 6L, 1L, 2L, 4L, 5L), White = c(436L, 
653L, 496L, 786L, 547L, 513L), Wheat = c(456L, 571L, 490L, 611L, 
474L, 443L), Multigrain = c(417L, 557L, 403L, 570L, 424L, 380L
), Black = c(311L, 416L, 351L, 473L, 365L, 317L), Cinnamon.Raisin = c(95L, 
129L, 114L, 165L, 144L, 100L), Sour.Dough.French = c(96L, 140L, 
108L, 148L, 104L, 92L), Light.Oat = c(224L, 224L, 228L, 304L, 
256L, 180L)), .Names = c("Day.of.Week", "White", "Wheat", "Multigrain", 
"Black", "Cinnamon.Raisin", "Sour.Dough.French", "Light.Oat"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

Upvotes: 1

MrFlick

Reputation: 206566

Because you want to group by day of the week, tapply would be a good choice here. You can do

tapply(rowSums(bakery[,-1]), factor(bakery[,1]), mean)

because in this case the mean of the row sums should be the same as the sum of the column means. It's not easy to test because your sample result does not seem to match your test data (there are rows with Day.of.Week 7).
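As a rough check of that claim, and as one possible sapply version of the loop in the question, here is a sketch that assumes the six rows posted as dput() output elsewhere in this thread. The names per_column_means, sum_of_means and mean_of_sums are made up for the example.

```r
# Data copied from the dput() output posted in the thread.
bakery <- data.frame(
  Day.of.Week       = c(5L, 6L, 1L, 2L, 4L, 5L),
  White             = c(436L, 653L, 496L, 786L, 547L, 513L),
  Wheat             = c(456L, 571L, 490L, 611L, 474L, 443L),
  Multigrain        = c(417L, 557L, 403L, 570L, 424L, 380L),
  Black             = c(311L, 416L, 351L, 473L, 365L, 317L),
  Cinnamon.Raisin   = c(95L, 129L, 114L, 165L, 144L, 100L),
  Sour.Dough.French = c(96L, 140L, 108L, 148L, 104L, 92L),
  Light.Oat         = c(224L, 224L, 228L, 304L, 256L, 180L)
)

# Sum of the per-day means: one tapply() per bread column via sapply(),
# giving a days-by-bread matrix, then add across the columns.
per_column_means <- sapply(bakery[-1], function(col) tapply(col, bakery[[1]], mean))
sum_of_means <- rowSums(per_column_means)

# Mean of the per-row sums: add across columns first, then average within each day.
mean_of_sums <- tapply(rowSums(bakery[-1]), bakery[[1]], mean)

# as.numeric() drops names and dims so that only the values are compared.
all.equal(as.numeric(sum_of_means), as.numeric(mean_of_sums))
```

The two agree because the mean is linear and every bread column is averaged over the same rows within a day. With missing values and na.rm = TRUE that would no longer be guaranteed.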

Upvotes: 2
