Reputation: 13
I have large data set of stream chemistry for several streams for long periods of time (7-20 years worth of data). I want to obtain a monthly TOC value for every year for each site but there are times when there is only 1 TOC value for a given month while other months have 2 or more. When I use summarize (example below) I get NA's for instances with only 1 TOC value.
df_TOC <- df %>% group_by(Site,Year, Month) %>% summarise(AvgTOC=mean(TOC))
I would like some help on how to summarize the data with a condition statement something along the lines of:
If TOC has 2 or more values per month, then calculate the mean; if not then paste the single value for that month.
Site Month Year TOC
AAA Jan 1975 5
AAA Feb 1975 7
AAA Feb 1975 8
AAA March 1975 10
AAA March 1975 12
AAA April 1975 13
BBB Jan 1975 6
BBB Jan 1975 3
BBB Feb 1975 6
BBB Feb 1975 12
BBB Feb 1975 17
BBB March 1975 4
Upvotes: 0
Views: 1739
Reputation: 3194
I'm a big fan of the base R function aggregate
and from the comments it seems you may have NA values that are leading to the NAs - as the mean of a single value should simply be that value. Try out:
aggregate(x = list(TOC_avg = myData$TOC), by = myData[,-4], mean, na.rm = T)
Where we use list()
to name the resulting aggregated value, and column 4 in the data is the original TOC column.
Upvotes: 0
Reputation: 3905
Your code does exactly what you want already. The mean value of a single value is the value itself.
myData = read.table(textConnection("Site Month Year TOC
AAA Jan 1975 5
AAA Feb 1975 7
AAA Feb 1975 8
AAA March 1975 10
AAA March 1975 12
AAA April 1975 13
BBB Jan 1975 6
BBB Jan 1975 3
BBB Feb 1975 6
BBB Feb 1975 12
BBB Feb 1975 17
BBB March 1975 4"), header = TRUE)
library(dplyr)
df_TOC = myData %>% group_by(Site,Year, Month) %>% summarise(AvgTOC=mean(TOC))
> df_TOC
## A tibble: 7 x 4
## Groups: Site, Year [?]
# Site Year Month AvgTOC
# <fctr> <int> <fctr> <dbl>
#1 AAA 1975 April 13.00000 #<<<
#2 AAA 1975 Feb 7.50000
#3 AAA 1975 Jan 5.00000 #<<< These ones come from an single sample each
#4 AAA 1975 March 11.00000
#5 BBB 1975 Feb 11.66667
#6 BBB 1975 Jan 4.50000
#7 BBB 1975 March 4.00000 #<<<
Upvotes: 1