Reputation: 1255
I have a data frame like this
V1 V2 V3 V4 month year
1 -1 9 1 1 1989
1 -1 9 1 1 1989
4 -1 9 1 2 1989
3 2 7 3 1 1990
4 4 8 2 2 1990
3 6 9 2 2 1990
4 7 0 2 2 1990
5 8 4 2 2 1990
where the first 4 rows indicate the value of the quantity A in the cell 1,2,3,4 and the last two columns give the month and the year. What I would like to do is to calculate the monthly average of A for every cell and so to end up with a list
V1
1989
<A>jen <A>feb ..
1 4
1990
<A>jen <A>feb ..
3 4
V2
V3
Many thanks
Upvotes: 0
Views: 92
Reputation: 193687
I was still hoping for something a little bit more precise in your question as to what your desired output is exactly, but since you haven't updated that part, I'll provide two options.
aggregate
seems to be a pretty straightforward tool for this task, particularly if sticking with a "wide" format would be fine for your needs.
aggregate(. ~ year + month, mydf, mean)
# year month V1 V2 V3 V4
# 1 1989 1 1 -1.00 9.00 1
# 2 1990 1 3 2.00 7.00 3
# 3 1989 2 4 -1.00 9.00 1
# 4 1990 2 4 6.25 5.25 2
If you prefer your data in a "long" format, you should explore the "reshape2" package which can handle the reshaping and aggregating in just a few steps.
library(reshape2)
mydfL <- melt(mydf, id.vars = c("year", "month"))
## The next step is purely cosmetic...
mydfL$month <- factor(month.abb[mydfL$month], month.abb, ordered = TRUE)
head(mydfL)
# year month variable value
# 1 1989 Jan V1 1
# 2 1989 Jan V1 1
# 3 1989 Feb V1 4
# 4 1990 Jan V1 3
# 5 1990 Feb V1 4
# 6 1990 Feb V1 3
## This is the actual aggregation and reshaping step...
out <- dcast(mydfL, variable + year ~ month,
value.var = "value", fun.aggregate = mean)
out
# variable year Jan Feb
# 1 V1 1989 1 4.00
# 2 V1 1990 3 4.00
# 3 V2 1989 -1 -1.00
# 4 V2 1990 2 6.25
# 5 V3 1989 9 9.00
# 6 V3 1990 7 5.25
# 7 V4 1989 1 1.00
# 8 V4 1990 3 2.00
Upvotes: 1