Reputation: 175
I have two arrays of different sizes. I can easily compute the KS statistic using scipy.stats.ks_2samp, but how can I draw the CDFs like in the picture?
Upvotes: 2
Views: 1627
Reputation: 211
One big part of this is sns.ecdfplot. In particular, see this SO answer:
Code to produce the cumulative distribution plots:
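    import seaborn as sns
    import pandas as pd

    # x and z have different lengths, so wrap each in a Series;
    # the shorter column is padded with NaN
    sns.ecdfplot(pd.DataFrame({'x': pd.Series(x), 'z': pd.Series(z)}))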
Edit: It is easier to use np.cumsum, like in this SO answer:
As mentioned, cumsum from numpy works well. Make sure that your data is a proper PDF (i.e., integrates to one), otherwise the CDF won't end at unity as it should. Here is a minimal working example:
    import numpy as np
    import matplotlib.pyplot as plt

    # Create some test data
    dx = 0.01
    X = np.arange(-2, 2, dx)
    Y = np.exp(-X ** 2)

    # Normalize the data to a proper PDF
    Y /= (dx * Y).sum()

    # Compute the CDF
    CY = np.cumsum(Y * dx)

    # Plot both
    plt.plot(X, Y)
    plt.plot(X, CY, 'r--')
    plt.show()
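For raw samples of different sizes (as in the question), a rough sketch that skips the PDF step and builds the empirical CDFs directly might look like the following. It assumes the picture shows two step curves with the largest vertical gap marked; that is a guess, since the picture is not available here. The helper names ecdf_on_grid, ks_gap and plot_ks are made up for illustration, and the test data is arbitrary.

```python
import numpy as np


def ecdf_on_grid(sample, grid):
    """Fraction of `sample` values <= each grid point (right-continuous ECDF)."""
    sample = np.sort(np.asarray(sample))
    return np.searchsorted(sample, grid, side="right") / sample.size


def ks_gap(x, z):
    """Return (statistic, location, F_x(location), F_z(location))
    for the largest vertical gap between the two empirical CDFs."""
    # The gap can only change at observed values, so the pooled data is enough.
    grid = np.sort(np.concatenate([x, z]))
    fx = ecdf_on_grid(x, grid)
    fz = ecdf_on_grid(z, grid)
    i = np.argmax(np.abs(fx - fz))
    return abs(fx[i] - fz[i]), grid[i], fx[i], fz[i]


def plot_ks(x, z):
    """Draw both ECDFs as step curves and mark the largest gap."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    d, loc, fx, fz = ks_gap(x, z)
    fig, ax = plt.subplots()
    for sample, label in ((x, "x"), (z, "z")):
        s = np.sort(sample)
        ax.step(s, np.arange(1, s.size + 1) / s.size, where="post", label=label)
    ax.vlines(loc, min(fx, fz), max(fx, fz), colors="k", linestyles="--",
              label=f"KS = {d:.3f}")
    ax.set_xlabel("value")
    ax.set_ylabel("cumulative probability")
    ax.legend()
    plt.show()


# Arbitrary test data with different sizes (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200)
z = rng.normal(0.5, 1.0, size=350)

d, loc, _, _ = ks_gap(x, z)
print(f"largest ECDF gap {d:.4f} at value {loc:.4f}")
# plot_ks(x, z)  # uncomment to draw the figure
```

The gap computed this way should agree with scipy.stats.ks_2samp(x, z).statistic for the default two-sided test, so the dashed line can be checked against the number you already have.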
Upvotes: 0