Achilles_03

Reputation: 21

Finding consecutive values beneath a threshold

I am trying to write some code to define some boundary layers.

The data is arranged in the table like this (but there are many more rows):

Depth (um)   Replicate 1 (O2 Sat %)   Replicate 2 (O2 Sat %)   Replicate 3 (O2 Sat %)
0            10                       11                       11
-100         11                       11                       12
-200         13                       12                       11
-300         14                       13                       14
-400         15                       15                       15
-500         16                       16                       16

For each of these replicates I want to find the size of the boundary layer. I am defining the boundary layer as the distance above the surface (0 um) over which the change between subsequent measurements is <5% per 100 um. So I need to find the depth of the first row at which this definition is no longer met. The function also needs to use overlapping windows (rows 1-4, then 2-5, then 3-6, and so on) rather than stepping down in blocks (rows 1-4, 5-8, etc.), so that I can identify the first time the boundary layer definition is not met. I would like to detect this change for each replicate.

I have tried looking for some ways to approach it but I am not sure I am asking the correct questions because I am not exactly sure which type of functions to start with. I am assuming this may be some type of threshold or cutoff type function but I thought I would get some ideas on how to proceed as my searching was not getting me anywhere.

I appreciate any advice or ideas on how to get started on this. Thank you for your time in advance.

Upvotes: 2

Views: 116

Answers (1)

thelatemail

Reputation: 93803

I think something like this can be accomplished with diff and a moving sum (via stats::filter):

Make some example data:

dat <- read.table(text="
Depth   Replicate1      Replicate2 Replicate3
0       10      11      11
-100    11      11      12
-200    13      12      11
-300    14      13      14
-400    15      15      15
-500    16      16      16", header=TRUE)

Set some variables:

cutoff      <- 0.05
windowsize  <- 4

Calculate the row-to-row difference as a proportion of the previous row's value (so 0.05 is 5%):

percdiff    <- diff(as.matrix(dat[-1])) / dat[-nrow(dat), -1]
percdiff
#  Replicate1 Replicate2  Replicate3
#1 0.10000000 0.00000000  0.09090909
#2 0.18181818 0.09090909 -0.08333333
#3 0.07692308 0.08333333  0.27272727
#4 0.07142857 0.15384615  0.07142857
#5 0.06666667 0.06666667  0.06666667

Check the percentage difference is above 5% at each row:

percdiff_co <- percdiff > cutoff
percdiff_co
#  Replicate1 Replicate2 Replicate3
#1       TRUE      FALSE       TRUE
#2       TRUE       TRUE      FALSE
#3       TRUE       TRUE       TRUE
#4       TRUE       TRUE       TRUE
#5       TRUE       TRUE       TRUE
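One thing to watch: the comparison above only flags increases. Replicate3 drops by about 8% at row 2 and comes out FALSE. If the direction of the change should not matter (that is an assumption about the boundary layer definition, not something stated in the question), a minimal sketch would compare the absolute value instead. The name percdiff_abs_co is only for illustration:

```r
dat <- read.table(text="
Depth   Replicate1      Replicate2 Replicate3
0       10      11      11
-100    11      11      12
-200    13      12      11
-300    14      13      14
-400    15      15      15
-500    16      16      16", header=TRUE)

cutoff <- 0.05

# Proportional change from each row to the next, as a plain matrix
percdiff <- diff(as.matrix(dat[-1])) / as.matrix(dat[-nrow(dat), -1])

# Assumption: a change counts if its magnitude exceeds the cutoff,
# whether it is an increase or a decrease
percdiff_abs_co <- abs(percdiff) > cutoff
percdiff_abs_co
```

With this version the Replicate3 entry at row 2 becomes TRUE, while the Replicate2 entry at row 1 (no change at all) stays FALSE.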

Calculate a moving sum for each replicate in the 4-observation window:

out <- stats::filter(percdiff_co, rep(1,windowsize), sides=1) 
out
#Time Series:
#Start = 1 
#End = 5 
#Frequency = 1 
#  [,1] [,2] [,3]
#1   NA   NA   NA
#2   NA   NA   NA
#3   NA   NA   NA
#4    4    3    3
#5    4    4    3

Check whether every difference in the window is above the cutoff, i.e. whether the moving sum equals the window size:

out <- out == windowsize
tail(out, -(windowsize-1))
#     [,1]  [,2]  [,3]
#[1,] TRUE FALSE FALSE
#[2,] TRUE  TRUE FALSE
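To get from that logical result to a depth for each replicate, something like the following might work. It is only a sketch: it assumes the depth you want is the one at the start of the first window in which all changes exceed the cutoff, and the names first_hit and start_depth are made up for illustration.

```r
dat <- read.table(text="
Depth   Replicate1      Replicate2 Replicate3
0       10      11      11
-100    11      11      12
-200    13      12      11
-300    14      13      14
-400    15      15      15
-500    16      16      16", header=TRUE)

cutoff     <- 0.05
windowsize <- 4

# Same steps as above: proportional change, then threshold
percdiff    <- diff(as.matrix(dat[-1])) / as.matrix(dat[-nrow(dat), -1])
percdiff_co <- percdiff > cutoff

# For each replicate, index of the first row where the moving sum over
# the last `windowsize` changes equals `windowsize` (NA if never)
first_hit <- apply(percdiff_co, 2, function(x) {
  s <- stats::filter(as.numeric(x), rep(1, windowsize), sides = 1)
  which(as.vector(s) == windowsize)[1]
})

# Assumption: report the depth at the start of that window
start_depth <- setNames(dat$Depth[first_hit - windowsize + 1], names(first_hit))
start_depth
```

With the example data this gives 0 for Replicate1, -100 for Replicate2 and NA for Replicate3, which never has four consecutive changes above the cutoff. If you would rather report the depth at the end of the window, index with first_hit + 1 instead.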

Upvotes: 1
