ako

Reputation: 3689

numpy / scipy: Making one series converge towards another after a period of time

I have a number of series in a pandas dataframe representing rates observed yearly.

For an experiment, I want some of these series' rates to converge towards one of the other series' rate in the last observed year.

For example, say I have the data below, and I decide column a is a meaningful target for column b to approach asymptotically over, say, a ten-year period in small, even-sized increments (or decreasing ones; it doesn't really matter).

I could of course do this in a loop, but I was wondering if there was a more general numpy or scipy vectorized way of making one series approach another asymptotically off the shelf.

rate               a         b                  
year                                                                       
2006               0.393620  0.260998          
2007               0.408620  0.260527
2008               0.396732  0.257396 
2009               0.418029  0.249123 
2010               0.414246  0.253526  
2011               0.415873  0.256586  
2012               0.414616  0.253865     
2013               0.408332  0.257504    
2014               0.401821  0.259208  

Upvotes: 6

Views: 1290

Answers (2)

smheidrich

Reputation: 4548

All right, so this is just the procedure you described in your comment, in code form, assuming a and b are your two numpy arrays (with a float dtype, since the update is done in place):

import numpy
b += (a[-1] - b[-1]) / len(b) * numpy.arange(1, len(b) + 1)

(a[-1]-b[-1])/len(b) is one "chunk", and one more chunk is added in each "iteration" (year) by multiplying with a numpy.arange() array, so the last value of b lands on a[-1]. I tried a few plots and it doesn't look good unless you tweak it, but it's what you asked for.

Example of what this looks like
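As a quick sanity check, here is a minimal sketch of the same linear ramp applied to the a and b columns from the question. It works out of place so the original b is kept; the names chunk and b_adjusted are only illustrative, not part of the answer above.

```python
import numpy as np

# Columns a and b from the question (years 2006-2014)
a = np.array([0.393620, 0.408620, 0.396732, 0.418029, 0.414246,
              0.415873, 0.414616, 0.408332, 0.401821])
b = np.array([0.260998, 0.260527, 0.257396, 0.249123, 0.253526,
              0.256586, 0.253865, 0.257504, 0.259208])

# One "chunk": the gap in the last observed year, split evenly over all years
chunk = (a[-1] - b[-1]) / len(b)

# Add 1, 2, ..., n chunks to successive years (b itself is left untouched)
b_adjusted = b + chunk * np.arange(1, len(b) + 1)

# The last adjusted value should match a[-1] up to floating-point rounding
print(b_adjusted[-1], a[-1])
```

Note that the first year is already shifted by one chunk, which is part of why the result can look odd on a plot without further tweaking.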

Upvotes: 3

Joe Kington

Reputation: 284820

Generally speaking, you'd apply an "easing function" over some range.

For example, consider the figure below:

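[Figure: four stacked panels showing the two original series, their difference, the easing function, and the modified series, with the easing region shaded gray]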

Here, we have two original datasets. We'll subtract the two, multiply the difference by the easing function shown in the third row, and then add the result back to the first curve. This will result in a new series that is the original data to the left of the gray region, a blend of the two within the gray region, and data from the other curve to the right of the gray region.

As an example:

import numpy as np
import matplotlib.pyplot as plt

# Generate some interesting random data
np.random.seed(1)
series1 = np.random.normal(0, 1, 1000).cumsum() + 20
series2 = np.random.normal(0, 1, 1000).cumsum()
# Our x-coordinates
index = np.arange(series1.size)

# Boundaries of the gray "easing region"
i0, i1 = 300, 700    

# In this case, I've chosen a sinusoidal easing function...
x = np.pi * (index - i0) / (i1 - i0)
easing = 0.5 * np.cos(x) + 0.5

# To the left of the gray region, easing should be 1 (all series2)
easing[index < i0] = 1

# To the right, it should be 0 (all series1)
easing[index >= i1] = 0

# Now let's calculate the new series that will slowly approach the first
# We'll operate on the difference and then add series1 back in 
diff = series2 - series1
series3 = easing * diff + series1
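If you want to reuse this on short yearly data like the columns in the question, the steps above could be wrapped up roughly as follows. This is only a sketch: the function name ease_towards and the choice of i0=3, i1=8 (blending from 2009 to 2014) are assumptions for illustration, not something taken from the question.

```python
import numpy as np

def ease_towards(source, target, i0, i1):
    """Blend `source` into `target` between indices i0 and i1 (cosine easing)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    index = np.arange(source.size)

    # Weight on `source`: 1 up to i0, falling smoothly to 0 at i1
    weight = 0.5 * np.cos(np.pi * (index - i0) / (i1 - i0)) + 0.5
    weight[index < i0] = 1
    weight[index >= i1] = 0

    # Operate on the difference, then add the target back in
    return weight * (source - target) + target

# Columns a and b from the question (years 2006-2014)
a = np.array([0.393620, 0.408620, 0.396732, 0.418029, 0.414246,
              0.415873, 0.414616, 0.408332, 0.401821])
b = np.array([0.260998, 0.260527, 0.257396, 0.249123, 0.253526,
              0.256586, 0.253865, 0.257504, 0.259208])

# Hypothetical choice: keep b unchanged through 2009, reach a by 2014
b_eased = ease_towards(b, a, i0=3, i1=8)
```

Swapping the cosine weight for any other function that goes from 1 to 0 across the region (a straight line, for instance) changes the shape of the approach without changing the rest of the code.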

Also, if you're curious about the plot above, here's how it's generated:

fig, axes = plt.subplots(nrows=4, sharex=True)

axes[0].plot(series1, color='lightblue', lw=2)
axes[0].plot(series2, color='salmon', lw=1.5)
axes[0].set(ylabel='Original Series')

axes[1].plot(diff, color='gray')
axes[1].set(ylabel='Difference')

axes[2].plot(easing, color='black', lw=2)
axes[2].margins(y=0.1)
axes[2].set(ylabel='Easing')

axes[3].plot(series1, color='lightblue', lw=2)
axes[3].plot(series3, color='salmon', ls='--', lw=2, dashes=(12,20))
axes[3].set(ylabel='Modified Series')

for ax in axes:
    ax.locator_params(axis='y', nbins=4)
for ax in axes[-2:]:
    ax.axvspan(i0, i1, color='0.8', alpha=0.5)

plt.show()

Upvotes: 5
