Reputation: 41
I have a data table (Precip15) with columns for precipitation, day of year (DOY), and Date_Time in POSIXct format. I need to see the total precipitation (Rain_cm) for each recorded day. Any suggestions?
An example of the data table format looks like this:
DOY Rain Rain_cm Date_Time
179 6 0.6 2019-06-28 15:00:00
179 0 NA 2019-06-28 15:15:00
179 2 0.2 2019-06-28 16:45:00
180 0 NA 2019-06-29 10:00:00
180 10.2 1.2 2019-06-29 10:15:00
180 2 0.2 2019-06-29 13:00:00
I need it to look like this:
DOY Rain_cm
179 0.8
180 1.4
or possibly:
Date Rain_cm
2019-06-28 0.8
2019-06-29 1.4
Thanks in advance for any help!
Upvotes: 2
Views: 2746
Reputation: 269526
Here are some base R solutions using the data frame DF defined reproducibly in the Note at the end. Solutions based on the dplyr, data.table or zoo packages are possible as well.
1) aggregate Aggregate on DOY or on Date (defined in the transform statement below), depending on what you want. Note that the formula method of aggregate drops rows with NAs by default (na.action = na.omit).
aggregate(Rain_cm ~ DOY, DF, sum)
## DOY Rain_cm
## 1 179 0.8
## 2 180 1.4
DF2 <- transform(DF, Date = as.Date(Date_Time))
aggregate(Rain_cm ~ Date, DF2, sum)
## Date Rain_cm
## 1 2019-06-28 0.8
## 2 2019-06-29 1.4
2) rowsum Another base R solution is rowsum, which returns a one-column matrix whose row names are the values of the grouping variable. DF2 is from (1).
with(na.omit(DF), rowsum(Rain_cm, DOY))
## [,1]
## 179 0.8
## 180 1.4
with(na.omit(DF2), rowsum(Rain_cm, Date))
## [,1]
## 2019-06-28 0.8
## 2019-06-29 1.4
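Since rowsum returns a matrix rather than the two-column table asked for in the question, a small conversion step may be needed. This is a minimal sketch on made-up data (DF_demo and daily are illustrative names), assuming the grouping values are numeric days of year.

```r
# Hypothetical data in the same shape as the question's table
DF_demo <- data.frame(
  DOY     = c(179, 179, 179, 180, 180, 180),
  Rain_cm = c(0.6, NA, 0.2, NA, 1.2, 0.2)
)

# One-column matrix of sums; row names hold the DOY values as strings
m <- with(na.omit(DF_demo), rowsum(Rain_cm, DOY))

# Move the row names into a proper column and drop them from the result
daily <- data.frame(
  DOY       = as.numeric(rownames(m)),
  Rain_cm   = m[, 1],
  row.names = NULL
)
```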
3) tapply Another base R approach is tapply. This produces a named numeric vector. DF2 is from (1).
with(DF, tapply(Rain_cm, DOY, sum, na.rm = TRUE))
## 179 180
## 0.8 1.4
with(DF2, tapply(Rain_cm, Date, sum, na.rm = TRUE))
## 2019-06-28 2019-06-29
## 0.8 1.4
4) xtabs xtabs can be used to form an xtabs table object. DF2 is from (1).
xtabs(Rain_cm ~ DOY, DF)
## DOY
## 179 180
## 0.8 1.4
xtabs(Rain_cm ~ Date, DF2)
## Date
## 2019-06-28 2019-06-29
## 0.8 1.4
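If a data frame is preferred over the table object from (4), as.data.frame can do the conversion. The sketch below uses made-up data without NAs (DF_demo is hypothetical) and assumes the grouping column should end up numeric again.

```r
# Hypothetical data, NA rows already removed
DF_demo <- data.frame(
  DOY     = c(179, 179, 180, 180),
  Rain_cm = c(0.6, 0.2, 1.2, 0.2)
)

tab <- xtabs(Rain_cm ~ DOY, DF_demo)

# responseName sets the name of the value column (default is "Freq")
daily <- as.data.frame(tab, responseName = "Rain_cm")

# The grouping column comes back as a factor; go through character
# to recover the original numbers rather than the factor codes
daily$DOY <- as.numeric(as.character(daily$DOY))
```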
Note: The data in reproducible form is assumed to be:
Lines <- "DOY Rain Rain_cm Date_Time
179 6 0.6 2019-06-28 15:00:00
179 0 NA 2019-06-28 15:15:00
179 2 0.2 2019-06-28 16:45:00
180 0 NA 2019-06-29 10:00:00
180 10.2 1.2 2019-06-29 10:15:00
180 2 0.2 2019-06-29 13:00:00"
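L <- readLines(textConnection(Lines))
# Turn only the first three runs of spaces into commas so that
# the date and the time stay together in a single Date_Time field
for (i in 1:3) L <- sub(" +", ",", L)
DF <- read.csv(text = L)
# UTC is assumed here so that as.Date(Date_Time) cannot shift the day
DF$Date_Time <- as.POSIXct(DF$Date_Time, tz = "UTC")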
Upvotes: 3
Reputation: 38
You can use the aggregate and cut functions to calculate your total daily precipitation values. The following code gives the desired result:
precipTotals <- aggregate(Rain_cm ~ cut(Date_Time, breaks = "day"), data = df,
                          FUN = sum, na.rm = TRUE)
For this to work, make sure your precipitation columns are numeric (as.numeric()) and your Date_Time column is POSIXct (as.POSIXct()).
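To make the role of cut clearer, here is a minimal sketch on made-up data (df_demo, Day and daily are illustrative names, and the UTC time zone is an assumption). cut with breaks = "day" turns each timestamp into a factor level for its calendar day, which aggregate can then group on.

```r
# Hypothetical timestamps in POSIXct format, as the answer requires
df_demo <- data.frame(
  Date_Time = as.POSIXct(c("2019-06-28 15:00:00", "2019-06-28 16:45:00",
                           "2019-06-29 10:15:00", "2019-06-29 13:00:00"),
                         tz = "UTC"),
  Rain_cm   = c(0.6, 0.2, 1.2, 0.2)
)

# Each timestamp is binned into the day it falls in (a factor)
df_demo$Day <- cut(df_demo$Date_Time, breaks = "day")

# Sum precipitation within each day
daily <- aggregate(Rain_cm ~ Day, data = df_demo, FUN = sum)

# Convert the factor labels back to Date values for later use
daily$Day <- as.Date(as.character(daily$Day))
```

Storing the binned day in its own column first also gives the result a readable column name instead of the full cut(...) expression.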
Upvotes: 0
Reputation: 8117
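library(dplyr)      # mutate, group_by, summarise, %>%
library(tibble)     # tribble
library(lubridate)  # ymd_hms
df <- tribble(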
~DOY, ~Rain, ~Rain_cm, ~Date_Time
, 179 , 6 , 0.6 , "2019-06-28 15:00:00"
, 179 , 0 , NA , "2019-06-28 15:15:00"
, 179 , 2 , 0.2 , "2019-06-28 16:45:00"
, 180 , 0 , NA , "2019-06-29 10:00:00"
, 180 , 10.2 , 1.2 , "2019-06-29 10:15:00"
, 180 , 2 , 0.2 , "2019-06-29 13:00:00"
)
df %>%
  mutate(Date_Time = ymd_hms(Date_Time)) %>%
  mutate(Date = as.Date(Date_Time)) %>%
  group_by(Date) %>%
  summarise(perDate = sum(Rain_cm, na.rm = TRUE))
Date perDate
<date> <dbl>
1 2019-06-28 0.8
2 2019-06-29 1.4
Upvotes: 0