Adam Robinsson

Reputation: 1761

R: Find X-value corresponding to specific Y-value on graph

I am studying a continuous variable, measured every ten minutes for 2 hours. I want to know at what time the variable has doubled and tripled.

Example data:
# The time variable
time <- seq(from = 0, to = 120, by=10)
# The measured variable
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
# Put together
df <- data.frame(time, value)
# Plotted (requires ggplot2)
library(ggplot2)
ggplot(df, aes(time, value)) + geom_line()


# At what time point (what X value) does Y equal (for example) 10?

# I've tried the following, based on previous suggestions on this site, but the result turned out to be unreliable and heavily dependent on the "interval" specified.

f1 <- approxfun(df$time, df$value)
optimize(function(t0) abs(f1(t0) - 10), interval = c(0, 120))[[1]]

Does anyone know of another function that can find the X value without depending on an interval? I am asking again because changing the interval slightly (while still keeping the true value inside it) changes the result.

Thanks for any advice

Upvotes: 1

Views: 12304

Answers (1)

AntoniosK

Reputation: 16121

I don't know if it is useful and practical for you, but my idea is to fit a (polynomial) curve to your data and then use this curve to "predict" (find) your x value for any y value. In case your y value corresponds to multiple x values you'll keep the first one.

I suggest running the process step by step to see how your initial dataset gets transformed.

library(ggplot2)
library(dplyr)

# The time variable
time <- seq(from = 0, to = 120, by=10)
# The measured variable
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
# Put together
df <- data.frame(time, value)

# Plot value (y axis) against time (x axis)
ggplot(df, aes(time, value)) + 
  geom_point()


You need a process that excludes the overlapping parts, where one value corresponds to more than one time. I'm using a process that spots where "value" falls below its running maximum. Those rows are excluded.

# create a row index
df %>% mutate(id = row_number()) -> df

df_updated = 
    df %>%
    group_by(id) %>%          # for each row
    do(data.frame(.,max_value = max(df$value[df$id <= .$id]))) %>%   # obtain the maximum value up to that point
    ungroup() %>%
    filter(value >= max_value)     # exclude declining parts


# Plot value (y axis) against time (x axis) from the updated dataset
ggplot(df_updated, aes(time, value)) + 
  geom_point()


Those are the data points you need to consider.
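
As a cross-check, the running-maximum filter above can also be sketched in base R with cummax(), which returns the largest value seen up to each row. This is only an equivalent sketch of the same idea, not a replacement for the dplyr code; the name df_rising is made up for illustration.

```r
# Same example data as above
time  <- seq(from = 0, to = 120, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)
df <- data.frame(time, value)

# cummax() gives the running maximum of value up to each row
df$max_value <- cummax(df$value)

# Keep only rows that reach the running maximum, i.e. drop the declining tail
# (df_rising is a hypothetical name used only in this sketch)
df_rising <- df[df$value >= df$max_value, ]
df_rising
```

For this dataset the rows at time 100, 110 and 120 are dropped, leaving the ten rising points.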

# fit a polynomial curve that best describes your data
# NOTE: it takes some extra work to find which degree gives an acceptable fit
# (you could write a process that picks the optimal degree). Here I used 8.
fit <- lm(time ~ poly(value, 8, raw = TRUE), data = df_updated)

# check how good your fit is
ggplot(df_updated, aes(time, value)) + 
  geom_point() +
  geom_line(aes(predict(fit, df_updated), value))
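
One possible way to choose the degree, instead of picking it by eye, is leave-one-out cross-validation: refit the polynomial with each point held out in turn and compare the prediction errors across degrees. The sketch below is only an illustration under assumptions: loo_rmse, df_rising and the candidate range 1 to 5 are hypothetical choices, and the data frame is rebuilt from the ten rising points so the block runs on its own.

```r
# The ten rising points kept by the filtering step
time  <- seq(from = 0, to = 90, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14)
df_rising <- data.frame(time, value)

# Hypothetical helper: leave-one-out RMSE of time ~ polynomial in value
loo_rmse <- function(degree, data) {
  errs <- sapply(seq_len(nrow(data)), function(i) {
    # Fit without row i, then predict the held-out time
    fit_i <- lm(time ~ poly(value, degree, raw = TRUE), data = data[-i, ])
    data$time[i] - predict(fit_i, newdata = data[i, ])
  })
  sqrt(mean(errs^2))
}

# Candidate degrees (an arbitrary range chosen for this sketch)
degrees <- 1:5
cv <- sapply(degrees, loo_rmse, data = df_rising)

# Degree with the smallest cross-validated error
best_degree <- degrees[which.min(cv)]
data.frame(degree = degrees, loo_rmse = cv)
```

With only ten points, a high degree such as 8 comes close to interpolating the data, so a comparison like this can show whether a lower degree predicts held-out points better.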


# get the time at value = 10
predict(fit, data.frame(value=10))

 #        1 
 # 41.67011 
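
If a straight line between neighbouring measurements is an acceptable model, an alternative that needs no search interval and no curve fit is to look for segments where the data cross the target and interpolate linearly inside them. This is a different technique from the polynomial fit above, so the numbers differ slightly; find_crossings is a hypothetical helper written for this sketch.

```r
# Original example data (all 13 measurements)
time  <- seq(from = 0, to = 120, by = 10)
value <- c(5, 5.5, 7.8, 8.3, 9.5, 10.9, 11.5, 12, 13, 14, 12.5, 11.1, 9)

# Hypothetical helper: every x at which the piecewise-linear curve
# through (x, y) equals target
find_crossings <- function(x, y, target) {
  d <- y - target
  n <- length(d)
  # Segments whose end points lie on opposite sides of the target
  i <- which(d[-n] * d[-1] < 0)
  # Linear interpolation inside each of those segments
  crossings <- x[i] + (x[i + 1] - x[i]) * (target - y[i]) / (y[i + 1] - y[i])
  # Also include measurements that hit the target exactly
  sort(unique(c(crossings, x[d == 0])))
}

find_crossings(time, value, 10)
# [1]  43.57143 115.23810
```

The first element is the first time the value reaches 10, and the second is where the curve falls back through 10 on the way down. A target that is never reached, such as 15 here, gives an empty vector.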

Upvotes: 1
