Reputation: 1761
I am studying a continuous variable, measured every ten minutes for 2 hours. I want to know at what time the variable has doubled and tripled.
Example data:
# The time variable
time <- seq(from = 0, to = 120, by=10)
# The measured variable
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
# Put together
df <- data.frame(time, value)
# Plotted
library(ggplot2)
ggplot(df, aes(time, value)) + geom_line()
# At what time point (what X value) does Y equal (for example) 10?
# I've tried the following, based on previous suggestions on this site, but it turned out to be unreliable and heavily dependent on the "interval" specified.
f1 <- approxfun(df$time, df$value)
optimize(function(t0) abs(f1(t0) - 10), interval = c(0, 120))[[1]]
Does anyone know of another function that can find the X value without depending on an interval? The reason I'm asking again is that changing the interval slightly (while still keeping the true value within it) changes the result.
Thanks for any advice
Upvotes: 1
Views: 12304
Reputation: 16121
I don't know if it is useful and practical for you, but my idea is to fit a (polynomial) curve to your data and then use this curve to "predict" (find) your x value for any y value. In case your y value corresponds to multiple x values you'll keep the first one.
I suggest you run the process step by step to see how your initial dataset gets transformed.
library(ggplot2)
library(dplyr)
# The time variable
time <- seq(from = 0, to = 120, by=10)
# The measured variable
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
# Put together
df <- data.frame(time, value)
# Plot value (y axis) against time (x axis)
ggplot(df, aes(time, value)) +
  geom_point()
You need a process that excludes the overlapping parts, i.e. the points where a given "value" is reached a second time. I'm using a process that spots when "value" (y axis) starts getting smaller than the maximum reached so far. Those cases are excluded.
# create a row index
df <- df %>% mutate(id = row_number())
df_updated <-
  df %>%
  group_by(id) %>%                                                  # for each row
  do(data.frame(., max_value = max(df$value[df$id <= .$id]))) %>%  # obtain the maximum value up to that point
  ungroup() %>%
  filter(value >= max_value)                                        # exclude declining parts
# Plot value (y axis) against time (x axis) from the updated dataset
ggplot(df_updated, aes(time, value)) +
  geom_point()
Those are the data points you need to consider.
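The running-maximum filter above can also be sketched in base R. This is only an equivalent shortcut under the same rule (keep a row when its value is at least the maximum seen so far); it uses `cummax()` instead of the row-by-row `do()` step and rebuilds the example data so it runs on its own.

```r
# Example data from the question
time  <- seq(from = 0, to = 120, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
df    <- data.frame(time, value)

# cummax() gives the maximum value reached up to each row;
# keep only rows that are at (or set) that running maximum
df_updated <- df[df$value >= cummax(df$value), ]

df_updated
```

With the example data this keeps the rising part of the curve and drops the points after the peak.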
# fit a polynomial curve that best describes your data
# NOTE: it takes some extra work to find which degree gives an acceptable fit
# (you can create a process that calculates your optimal degree). Here I used 8.
fit <- lm(time ~ poly(value, 8, raw = TRUE), data = df_updated)
# check how good your fit is
ggplot(df_updated, aes(time, value)) +
  geom_point() +
  geom_line(aes(predict(fit, df_updated), value))
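On choosing the degree: a degree-8 polynomial has 9 coefficients, so with only 10 retained points it nearly interpolates them. One possible way to pick a degree is leave-one-out cross-validation. The following is a rough sketch, not part of the original answer; `loo_rmse` is a hypothetical helper and the candidate range `1:5` is an arbitrary assumption.

```r
# Example data, filtered to the non-declining part as above
time  <- seq(from = 0, to = 120, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
df    <- data.frame(time, value)
df_updated <- df[df$value >= cummax(df$value), ]

# Hypothetical helper: leave-one-out RMSE for a given polynomial degree
loo_rmse <- function(degree, data) {
  errs <- sapply(seq_len(nrow(data)), function(i) {
    # fit without row i, then predict the time of row i
    fit_i <- lm(time ~ poly(value, degree, raw = TRUE), data = data[-i, ])
    data$time[i] - predict(fit_i, newdata = data[i, , drop = FALSE])
  })
  sqrt(mean(errs^2))
}

degrees     <- 1:5   # assumed candidate range
cv_error    <- sapply(degrees, loo_rmse, data = df_updated)
best_degree <- degrees[which.min(cv_error)]

data.frame(degree = degrees, loo_rmse = cv_error)
```

The degree with the smallest cross-validated error is a reasonable starting point; it is still worth checking the fitted curve visually.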
# get the time at value = 10
predict(fit, data.frame(value=10))
# 1
# 41.67011
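If straight-line interpolation between measurements is acceptable, the crossing times can also be computed directly, with no search interval and no polynomial degree to choose. This is a different technique from the polynomial fit above, so its result will differ from it. In the example data the series reaches 10 twice (once rising, once falling), so `abs(f1(t) - 10)` has two minima, which is probably why `optimize()` gives different answers for different intervals. A minimal sketch; `find_crossings` is a hypothetical helper, not a function from any package.

```r
# Example data from the question
time  <- seq(from = 0, to = 120, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)

# Hypothetical helper: all x where the piecewise-linear curve through (x, y) equals target
find_crossings <- function(x, y, target) {
  d <- y - target
  n <- length(d)
  # segments whose endpoints lie on opposite sides of the target
  idx <- which(d[-n] * d[-1] < 0)
  # linear interpolation inside each of those segments
  crossed <- x[idx] + (x[idx + 1] - x[idx]) * (-d[idx]) / (d[idx + 1] - d[idx])
  # also include measurements that hit the target exactly
  sort(unique(c(x[d == 0], crossed)))
}

crossings <- find_crossings(time, value, target = 10)
crossings      # every time at which the interpolated curve equals 10
crossings[1]   # the first such time
```

For doubling and tripling, the same call can be made with `target = 2 * value[1]` and `target = 3 * value[1]`; an empty result means the level is never reached within the measured range.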
Upvotes: 1