Direnk

Reputation: 1

Finding Inflection Points on geom_line

I am stuck trying to find inflection points on a plot of cumulative rate over time that I drew with ggplot's geom_line.

The data is here.

In case I need to take a few steps back for a more feasible/better approach, here's my process. I assigned "1" to every case (counter column). I ordered the data by date and calculated the cumulative sum of the counter (cumulative column). I then divided the cumulative count by the total number of cases (7083 here) to get the cumulative rate (rate column).
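In code, those steps look roughly like the following. This is only a minimal sketch on made-up dates: the `cases` data frame is hypothetical, and it assumes the total equals the number of rows.

```r
# Hypothetical stand-in for the real data: one row per case, with a `date` column.
cases <- data.frame(
  date = as.Date(c("2010-03-01", "2009-01-15", "2011-07-20", "2010-11-05"))
)

cases <- cases[order(cases$date), , drop = FALSE]   # order by date
cases$counter    <- 1                               # "1" for every case
cases$cumulative <- cumsum(cases$counter)           # running count of cases
cases$rate       <- cases$cumulative / nrow(cases)  # share of the total so far (assumes total = nrow)
```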

I use ggplot to see how cumulative rate changes over time.

library(readr)    # read_csv()
library(ggplot2)  # ggplot(), geom_line()

zip <- read_csv("example.csv")

ggplot(data=zip, aes(date, rate)) + geom_line(color = "#275695", size = 1)


What I want to know is where the inflection point happens on this geom_line. I am aware that there are different inflection points, but I want to know where this rate "takes off." It is around 0.13 in this case. I need to carry out this analysis for hundreds of dataframes and calculate the average "taking off" point.

Any ideas or approaches would be extremely helpful!

Many thanks!

Upvotes: 0

Views: 700

Answers (1)

Allan Cameron

Reputation: 173948

I think you're being fooled by the shape of this curve, which shows approximately exponential growth.

To see what I mean, let's just plot from 2005 to 2010:

ggplot(data = zip, aes(date, rate)) + 
  geom_line(color = "#275695", size = 1) +
  coord_cartesian(xlim = as.Date(c('2005-01-01', '2010-01-01')),
                  ylim = c(0, 0.015))


Wow - it really "takes off" about 2009. Maybe there's some kind of inflection point in there?

Now let's plot 2005 to 2012:

ggplot(data = zip, aes(date, rate)) + 
  geom_line(color = "#275695", size = 1) +
  coord_cartesian(xlim = as.Date(c('2005-01-01', '2012-01-01')),
                  ylim = c(0, 0.045))


Wow! Forget 2009! It was 2010 when things really took off. In fact, we can now see 2009 looks like it was hardly taking off at all. What were we thinking? There's probably an inflection point around 2010 to 2011 somewhere, right?

Let's now plot out to 2014:

ggplot(data = zip, aes(date, rate)) + 
  geom_line(color = "#275695", size = 1) +
  coord_cartesian(xlim = as.Date(c('2005-01-01', '2014-01-01')),
                  ylim = c(0, 0.125))


Hmm. Now it looks like 2010 wasn't that dramatic after all, but check out our "inflection point" in 2012.


It seems that our plot keeps the same overall shape as we extend the x axis, and it is always tempting to think there is an inflection point about 2/3 of the way along, where the plot "really takes off". But that just reflects the fact that we're not very good at intuitively grasping how exponential growth looks when plotted on linear axes.
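You can convince yourself of this with a toy curve. The sketch below is not your data: it assumes pure exponential growth with an arbitrary rate `k`, and `takeoff()` is a made-up helper that reports the first time the curve reaches 10% of the highest value in the window, as a crude stand-in for where the eye sees the "take off".

```r
# Assumed toy curve: pure exponential growth with an arbitrary rate k.
k <- 0.5

# Hypothetical helper: first time the curve reaches `frac` of the window's maximum.
takeoff <- function(t_end, frac = 0.1) {
  t <- seq(0, t_end, by = 0.01)
  y <- exp(k * t)
  t[which(y >= frac * max(y))[1]]
}

window_ends <- c(10, 14, 18)
res <- sapply(window_ends, takeoff)
window_ends - res   # distance from the apparent take-off to the right edge of each window
```

The distance to the right edge comes out the same for every window, so the apparent "take off" simply follows wherever you decide to stop the x axis. It tells you about the window, not about the curve.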

In fact, if we plot it with a logarithmic y axis, we get the following:

ggplot(data = zip, aes(date, rate)) + 
  geom_line(color = "#275695", size = 1) + 
  scale_y_log10()


We can see from this that there is actually very clear exponential growth between 2005 and 2013. The growth then slows until some time in 2015. After this it picks back up, but the crucial point is that the part where you think the plot visually "takes off" actually represents slower growth in relative terms than anywhere in the period 2005 to 2013.
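If you want to put numbers on this rather than eyeball it, one option is to compare the slope of log(rate) in different periods, since that slope is the relative growth rate. The following is only a sketch under assumptions: `toy` is synthetic data standing in for `zip`, `log_slope()` is a helper I made up, and the 2013 break date is simply read off the plot.

```r
# Synthetic stand-in for `zip` (assumed): fast relative growth until 2013, slower afterwards.
date <- seq(as.Date("2005-01-01"), as.Date("2016-12-01"), by = "month")
yrs  <- as.numeric(date - date[1]) / 365.25
toy  <- data.frame(
  date = date,
  rate = exp(ifelse(yrs < 8, -9 + 0.9 * yrs, -1.8 + 0.2 * (yrs - 8)))
)

# Hypothetical helper: slope of log(rate) per year within [from, to).
log_slope <- function(df, from, to) {
  d <- df[df$date >= as.Date(from) & df$date < as.Date(to), ]
  d$years <- as.numeric(d$date - min(d$date)) / 365.25
  unname(coef(lm(log(rate) ~ years, data = d))["years"])
}

early <- log_slope(toy, "2005-01-01", "2013-01-01")
late  <- log_slope(toy, "2013-01-01", "2017-01-01")
c(early = early, late = late)   # a larger slope means faster relative growth
```

On a linear axis the later period of this toy series looks like the dramatic part, yet its log-scale slope is the smaller of the two. The same helper could be run over each of your data frames, as long as you choose the periods in a consistent way.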

The answer to your question, then, is that there is no inflection point at which growth really takes off. There is steady exponential growth with three different rates, but the highest rate of growth happens on the left of the curve - it's just that the plot is too "zoomed out" to see this.

Upvotes: 3
