Reputation: 11

Add consecutive temp values above threshold to create "degree hours"

I am working with a dataset of hourly temperatures and I need to calculate "degree hours" above a heat threshold for each extreme event. I intend to run stats on the intensities (combined magnitude and duration) of each event to compare multiple sites over the same time period.

Example of data:

        Temp 
1     14.026
2     13.714
3     13.25
.....
21189 12.437
21190 12.558
21191 12.703
21192 12.896

Data after selecting only hours above the threshold of 18 degrees and then subtracting 18 to reveal degrees above 18:

       Temp
5297  0.010
5468  0.010
5469  0.343
5470  0.081
5866  0.010
5868  0.319
5869  0.652

After this step I need help to sum consecutive hours during which the reading exceeded my specified threshold.

What I am hoping to produce out of above sample:

I've debated manipulating these data within a time series or by adding additional columns, but I do not want multiple rows for each warming event. I would immensely appreciate any advice.

Upvotes: 1

Answers (2)

Neal Fultz

Reputation: 9687

This is an alternative solution in base R.

You have some data that walks around, and you want to sum up the points above a cutoff. For example:

set.seed(99999)
x <- cumsum(rnorm(30))
plot(x, type='b')
abline(h=2, lty='dashed')

which looks like this:

First, we want to split the data in to groups based on when they cross the cutoff. We can use run length encoding on the indicator to get a compressed version:

x.rle <- rle(x > 2)

which has the value:

Run Length Encoding
  lengths: int [1:8] 5 2 3 1 9 4 5 1
  values : logi [1:8] FALSE TRUE FALSE TRUE FALSE TRUE ...

The first group is the first 5 points where x > 2 is FALSE; the second group is the two following points, and so on.

We can create a group id by replacing the values in the rle object, and then back transforming:

x.rle$values <- seq_along(x.rle$values)
group <- inverse.rle(x.rle)

Finally, we aggregate by group, keeping only the data above the cut off:

aggregate(x~group, subset = x > 2, FUN=sum)

Which produces:

  group            x
1     2  5.113291213
2     4  2.124118005
3     6 11.775435706
4     8  2.175868979

Upvotes: 2

rosscova

Reputation: 5590

I'd use data.table for this, although there are certainly other ways.

library( data.table )
setDT( df )
temp.threshold <- 18

First make a column showing the previous value from each one in your data. This will help to find the point at which the temperature rose above your threshold value.

df[ , lag := shift( Temp, fill = 0, type = "lag" ) ]

Now use that previous value column to compare with the Temp column. Mark every point at which the temperature rose above the threshold with a 1, and all other points as 0.

df[ , group := 0L 
    ][ Temp > temp.threshold & lag <= temp.threshold, group := 1L ]

Now we can get cumsum of that new column, which will give each sequence after the temperature rose above the threshold its own group ID.

df[ , group := cumsum( group ) ]

Now we can get rid of every value not above the threshold.

df <- df[ Temp > temp.threshold, ]

And summarise what's left by finding the "degree hours" of each "group".

bygroup <- df[ , sum( Temp - temp.threshold ), by = group ]

I modified your input data a little to provide a couple of test events where the data rose above threshold:

structure(list(num = c(1L, 2L, 3L, 4L, 5L, 21189L, 21190L, 21191L, 
21192L, 21193L, 21194L), Temp = c(14.026, 13.714, 13.25, 20, 
19, 12.437, 12.558, 12.703, 12.896, 21, 21)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -11L), .Names = c("num", 
"Temp"), spec = structure(list(cols = structure(list(num = structure(list(), class = c("collector_integer", 
"collector")), Temp = structure(list(), class = c("collector_double", 
"collector"))), .Names = c("num", "Temp")), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

With that data, here's the output of the code above (note $V1 is in "degree hours"):

> bygroup
   group V1
1:     1  3
2:     2  6

Upvotes: 1

Add consecutive temp values above threshold to create &quot;degree hours&quot;

Answers (2)

Related Questions

Add consecutive temp values above threshold to create "degree hours"