Reputation: 912
I'm trying to plot the number of features against the variance explained for PCA. I'd like to use the line color to highlight where the variance explained is > 95%.
I've got the following code so far:
pcaPlotData = {
    'r': var[np.argwhere(var < 95)],
    'g': var[np.argwhere(var >= 95)]
}
fig, ax = plt.subplots()
for k, v in pcaPlotData.items():
    ax.plot(v, color=k)
ax.set_ylabel('% Variance Explained')
ax.set_xlabel('# of Features')
ax.set_title('PCA Analysis')
ax.set_ylim(var.min(), var.max() + 1)
plt.show()
which outputs the following plot:
However, the green line should start where the red line ends. How would I go about offsetting the green line?
Upvotes: 0
Views: 1010
Reputation: 39042
If your input data is NumPy arrays, an alternative solution could look like the following. Here you create a conditional mask and then use ~mask to access the elements that do not fulfill the condition. This saves you from creating the mask twice.
The following is a complete runnable example:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(40)
var = x**2
# Define the conditional mask
mask = (var<95)
plt.plot(x[mask], var[mask], 'r') # Data fulfilling the condition
plt.plot(x[~mask], var[~mask], 'g') # Data not fulfilling the condition
plt.show()
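One side effect of plotting two disjoint masks is that nothing connects the last red point to the first green point, so a small gap appears at the threshold. A minimal sketch of one way to close it, assuming var is sorted in non-decreasing order (as cumulative explained variance would be). The helper names split_at_threshold and plot_split are hypothetical, not part of NumPy or Matplotlib:

```python
import numpy as np


def split_at_threshold(var, threshold=95):
    """Return (x_below, x_above) index arrays that share one boundary point.

    Assumption: `var` is sorted in non-decreasing order.
    """
    var = np.asarray(var)
    x = np.arange(var.size)
    # Index of the first element that is >= threshold.
    first_above = int(np.searchsorted(var, threshold, side="left"))
    x_below = x[:first_above]
    # Start the upper segment one point early so the two lines touch.
    x_above = x[max(first_above - 1, 0):]
    return x_below, x_above


def plot_split(var, threshold=95):
    """Plot `var` in red below the threshold and in green from there on."""
    # Imported here so the splitting helper stays usable without matplotlib.
    import matplotlib.pyplot as plt

    var = np.asarray(var)
    x_below, x_above = split_at_threshold(var, threshold)
    fig, ax = plt.subplots()
    ax.plot(x_below, var[x_below], "r")
    ax.plot(x_above, var[x_above], "g")
    return fig, ax
```

Calling plot_split(var) followed by plt.show() should draw both segments on the same axes, with the green one beginning at the last red point. If var is not sorted, the boolean mask approach above is the safer choice.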
Upvotes: 1
Reputation: 15070
Simply this:
x = np.argwhere(var < 95)
ax.plot(x, var[x], 'r')
x = np.argwhere(var >= 95)
ax.plot(x, var[x], 'g')
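The reason this fixes the offset: when ax.plot receives only y-values, it numbers them 0, 1, 2, ... along the x-axis, so the green segment in the question was drawn starting at 0. Passing the original indices as x puts each point back at its true position. Note that np.argwhere returns a column of shape (N, 1); np.flatnonzero gives the same indices as a flat 1-D array, which some may find easier to work with. A small sketch of the difference, using made-up values for var:

```python
import numpy as np

# Made-up cumulative variance values, for illustration only.
var = np.array([60.0, 80.0, 90.0, 96.0, 99.0])

idx_2d = np.argwhere(var >= 95)      # column of indices, shape (2, 1)
idx_1d = np.flatnonzero(var >= 95)   # same indices, flat, shape (2,)

# Indexing with either keeps the shape of the index array.
green_2d = var[idx_2d]               # shape (2, 1)
green_1d = var[idx_1d]               # shape (2,)
```

With the flat version, ax.plot(idx_1d, green_1d, 'g') would place the green points at x = 3 and x = 4 rather than at 0 and 1.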
Upvotes: 0