Reputation: 1592
I'm trying to add a Pareto front to a scatter plot I have. The scatter plot data is:
    array([[1.44100000e+04, 3.31808987e+07],
           [1.21250000e+04, 3.22901074e+07],
           [6.03000000e+03, 2.84933900e+07],
           [8.32500000e+03, 2.83091317e+07],
           [6.68000000e+03, 2.56373373e+07],
           [5.33500000e+03, 1.89331461e+07],
           [3.87500000e+03, 1.84107940e+07],
           [3.12500000e+03, 1.60416570e+07],
           [6.18000000e+03, 1.48054565e+07],
           [4.62500000e+03, 1.33395341e+07],
           [5.22500000e+03, 1.23150492e+07],
           [3.14500000e+03, 1.20244820e+07],
           [6.79500000e+03, 1.19525083e+07],
           [2.92000000e+03, 9.18176770e+06],
           [5.45000000e+02, 5.66882578e+06]])
and the scatter plot looks like this:
I used this tutorial to plot the Pareto front, but the result looks wrong and I only get a tiny red line:
This is the code I have used:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    def identify_pareto(scores):
        # Count number of items
        population_size = scores.shape[0]
        # Create a NumPy index for scores on the Pareto front (zero indexed)
        population_ids = np.arange(population_size)
        # Create a starting list of items on the Pareto front
        # All items start off as being labelled as on the Pareto front
        pareto_front = np.ones(population_size, dtype=bool)
        print(pareto_front)
        # Loop through each item. This will then be compared with all other items
        for i in range(population_size):
            # Loop through all other items
            for j in range(population_size):
                # Check if our 'i' point is dominated by our 'j' point
                if all(scores[j] >= scores[i]) and any(scores[j] > scores[i]):
                    # j dominates i. Label 'i' point as not on Pareto front
                    pareto_front[i] = 0
                    # Stop further comparisons with 'i' (no more comparisons needed)
                    break
        # Return ids of scenarios on Pareto front
        return population_ids[pareto_front]
    pareto = identify_pareto(scores)
    pareto_front_df = pd.DataFrame(pareto_front)
    pareto_front_df.sort_values(0, inplace=True)
    pareto_front = pareto_front_df.values
    # here I get weird results as output:
    >>>
    array([[ 5, 81],
           [15, 80],
           [30, 79],
           [55, 77],
           [70, 65],
           [80, 60],
           [90, 40],
           [97, 23],
           [99,  4]])
    x_all = scores[:, 0]
    y_all = scores[:, 1]
    x_pareto = pareto_front[:, 0]
    y_pareto = pareto_front[:, 1]
    plt.scatter(x_all, y_all)
    plt.plot(x_pareto, y_pareto, color='r')
    plt.xlabel('Objective A')
    plt.ylabel('Objective B')
    plt.show()
The result is the tiny red line.
My question is: where is my mistake, and how can I get the Pareto line?
Upvotes: 1
Views: 1920
Reputation: 407
I don't think there is anything wrong with your code; the issue is rather the data held in scores (assuming scores is the first array you posted).
The first element of the array, [1.44100000e+04, 3.31808987e+07], is larger than every other point in both coordinates, so it dominates all of them.
It is therefore the only point in the outer loop for which the condition if all(scores[j] >= scores[i]) and any(scores[j] > scores[i]):
is never met, so its flag is not set to zero. All other points are set to zero.
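One way to check this is to run the same dominance test on the posted array. The sketch below assumes scores is that first array and that larger is better in both columns, as in the question's function. The line `pareto_front = scores[pareto]` is an assumed step: the snippet in the question does not show where `pareto_front` is built from `scores`.

```python
import numpy as np

# Data copied from the question (assumed to be the `scores` array)
scores = np.array([
    [1.44100000e+04, 3.31808987e+07],
    [1.21250000e+04, 3.22901074e+07],
    [6.03000000e+03, 2.84933900e+07],
    [8.32500000e+03, 2.83091317e+07],
    [6.68000000e+03, 2.56373373e+07],
    [5.33500000e+03, 1.89331461e+07],
    [3.87500000e+03, 1.84107940e+07],
    [3.12500000e+03, 1.60416570e+07],
    [6.18000000e+03, 1.48054565e+07],
    [4.62500000e+03, 1.33395341e+07],
    [5.22500000e+03, 1.23150492e+07],
    [3.14500000e+03, 1.20244820e+07],
    [6.79500000e+03, 1.19525083e+07],
    [2.92000000e+03, 9.18176770e+06],
    [5.45000000e+02, 5.66882578e+06],
])


def identify_pareto(scores):
    """Return indices of non-dominated rows, treating larger as better in every column."""
    n = scores.shape[0]
    on_front = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # j dominates i: at least as good everywhere and strictly better somewhere
            if np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i]):
                on_front[i] = False
                break
    return np.arange(n)[on_front]


pareto = identify_pareto(scores)
# Assumed step: select the front from the same array that was scored
pareto_front = scores[pareto]

print(pareto)        # [0]
print(pareto_front)  # a single row: the first point
```

With only one row in `pareto_front`, `plt.plot` has no segment to draw, so no line through the data should be expected under these assumptions.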
I believe this is the only point plotted as a red dot.
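If a front with several points is expected, one of the objectives probably has to be minimized rather than maximized. That is only a guess about the intent. The sketch below assumes the first column is a cost to minimize and the second a benefit to maximize; the `maximize` argument is a hypothetical addition and not part of the tutorial's function.

```python
import numpy as np

# Data copied from the question (assumed to be the `scores` array)
scores = np.array([
    [1.44100000e+04, 3.31808987e+07],
    [1.21250000e+04, 3.22901074e+07],
    [6.03000000e+03, 2.84933900e+07],
    [8.32500000e+03, 2.83091317e+07],
    [6.68000000e+03, 2.56373373e+07],
    [5.33500000e+03, 1.89331461e+07],
    [3.87500000e+03, 1.84107940e+07],
    [3.12500000e+03, 1.60416570e+07],
    [6.18000000e+03, 1.48054565e+07],
    [4.62500000e+03, 1.33395341e+07],
    [5.22500000e+03, 1.23150492e+07],
    [3.14500000e+03, 1.20244820e+07],
    [6.79500000e+03, 1.19525083e+07],
    [2.92000000e+03, 9.18176770e+06],
    [5.45000000e+02, 5.66882578e+06],
])


def identify_pareto(scores, maximize):
    """Indices of non-dominated rows; `maximize` holds one bool per column (hypothetical parameter)."""
    # Flip the sign of minimized columns so "larger is better" holds everywhere
    signs = np.where(np.asarray(maximize), 1.0, -1.0)
    s = scores * signs
    n = s.shape[0]
    on_front = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if np.all(s[j] >= s[i]) and np.any(s[j] > s[i]):
                on_front[i] = False
                break
    return np.flatnonzero(on_front)


# Assumption: minimize column 0, maximize column 1
pareto = identify_pareto(scores, maximize=[False, True])
pareto_front = scores[pareto]
# Sort by the first column so a line through the points runs left to right
pareto_front = pareto_front[np.argsort(pareto_front[:, 0])]
print(pareto_front)
```

The two columns of `pareto_front` can then be passed to `plt.plot` in place of `x_pareto` and `y_pareto` from the question's plotting code.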
Upvotes: 1