How do I query the best solution of a pyGAD GA instance?

Question

I've trained a population of neural networks using the genetic algorithm implementation provided by the PyGAD Python library. The code I've written so far is given below:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pygad.gann
import time
import pickle

ret = -1
n_sect = 174
population_size = 500
num_parents_mating = 4 
num_generations = 1000
mutation_percent = 5
parent_selection_type = "rank"
crossover_type = "two_points"
mutation_type = "random"
keep_parents = 1
init_range_low = -2
init_range_high = 5
n_div = 15

data = pd.read_csv("delta_results/sub_delta_{}.csv".format(n_sect), index_col=0)
data.index = pd.to_datetime(data.index)
data = list(data["Delta"])

function_inputs = np.array([data[i:i+n_div][:ret] for i in range(0, len(data), n_div)])
required_outputs = np.array([[data[i:i+n_div][ret]] for i in range(0, len(data), n_div)])

input_layer_size = function_inputs.shape[1]
n_hidden_layers = 2
hidden_layer_1_size = input_layer_size - 2
hidden_layer_2_size = input_layer_size - 4
output_layer_size = 1

population = pygad.gann.GANN(
    num_solutions=population_size, 
    num_neurons_input=input_layer_size, 
    num_neurons_output=output_layer_size, 
    num_neurons_hidden_layers=[hidden_layer_1_size, hidden_layer_2_size], # 2 Hidden Layers
    hidden_activations=["relu", "relu"],
    output_activation="None"
)

population_vectors = pygad.gann.population_as_vectors(population_networks=population.population_networks)

initial_population = population_vectors.copy()

def normalize(x):
    return x/np.linalg.norm(x, ord=2, axis=0, keepdims=True)

def fitness(solution, solution_index):
    prediction = pygad.nn.predict(last_layer=population.population_networks[solution_index], data_inputs=function_inputs, problem_type="regression")
    prediction = np.array(prediction)
    error = (prediction+0.0001)-required_outputs
    fitness = np.nan_to_num((np.abs(error)**(-2))).astype(np.float64)
    solution_fitness = np.sum(normalize(fitness))
    return solution_fitness

def on_generation(population_instance):
    global population
    population_matrices = pygad.gann.population_as_matrices(population_networks=population.population_networks, population_vectors=population_instance.population)
    population.update_population_trained_weights(population_trained_weights=population_matrices)

population_instance = pygad.GA(
    num_generations=num_generations,
    num_parents_mating=num_parents_mating,
    initial_population=initial_population,
    fitness_func=fitness,
    mutation_percent_genes=mutation_percent,
    init_range_low=init_range_low,
    init_range_high=init_range_high,
    parent_selection_type=parent_selection_type,
    crossover_type=crossover_type,
    mutation_type=mutation_type,
    keep_parents=keep_parents,
    on_generation=on_generation
)

saved_population = pygad.load(filename=".../population_data_v2")
best_solution = saved_population.best_solution()
print("Population Best Solution Info:\n| Attributes:\n{}\n| Fitness: {}\n| Solution Index: {}".format(best_solution[0], best_solution[1], best_solution[2]))
saved_population.plot_result()

Once the genetic algorithm has run, I save the population data into a file called population_data_v2.pkl (not shown above), and the file is created and saved successfully.

However, once I load the file, I don't know how to get information about the best neural network in the population.

All I get is a numpy.ndarray holding the solution (best_solution[0]). I don't know how to query it, or how to pass in the function inputs and see what the best solution predicts.

Any help would be greatly appreciated!

Ahmed Gad · Accepted Answer

Thanks for using PyGAD.

I see that you built the example correctly. You can use the best solution to make predictions in 3 simple steps.

Please note that after each generation, the population attribute is updated with the latest population. That means that after PyGAD completes all the generations, the last population is stored in the population attribute.

Step 1

After you load the saved model using the pygad.load() function, you can use its population attribute to restore the weights of the networks, just as you did in the on_generation callback:

population_matrices = pygad.gann.population_as_matrices(population_networks=population.population_networks, population_vectors=saved_population.population)
population.update_population_trained_weights(population_trained_weights=population_matrices)
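
To make the vector-to-matrix conversion in Step 1 less opaque, here is a rough plain-Python sketch of the idea: a flat solution vector is cut into one weight matrix per layer, and an input sample is then pushed through those matrices. This is an illustration, not PyGAD's implementation. The names vector_to_matrices and forward are hypothetical, and the sketch assumes the weights are stored layer by layer in row-major order with no bias terms, ReLU on hidden layers, and a linear output.

```python
def vector_to_matrices(vector, layer_sizes):
    """Split a flat weight vector into one weight matrix per layer.

    layer_sizes lists neuron counts from input to output, e.g. [2, 2, 1].
    Assumption: weights are laid out layer by layer, row-major, no biases.
    """
    matrices, start = [], 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        rows = []
        for _ in range(n_in):
            rows.append(list(vector[start:start + n_out]))
            start += n_out
        matrices.append(rows)
    if start != len(vector):
        raise ValueError("vector length does not match layer sizes")
    return matrices


def relu(values):
    """Element-wise ReLU on a list of numbers."""
    return [max(0.0, v) for v in values]


def forward(sample, matrices):
    """Propagate one input sample; ReLU on hidden layers, linear output."""
    activations = list(sample)
    for depth, weights in enumerate(matrices):
        n_out = len(weights[0])
        activations = [
            sum(a * row[j] for a, row in zip(activations, weights))
            for j in range(n_out)
        ]
        if depth < len(matrices) - 1:  # no activation on the output layer
            activations = relu(activations)
    return activations


# Tiny 2-2-1 network: 2*2 + 2*1 = 6 weights in the flat vector.
flat = [1.0, -1.0, 0.5, 2.0, 1.0, 3.0]
mats = vector_to_matrices(flat, [2, 2, 1])
output = forward([2.0, 4.0], mats)
print(output)
```

In practice the two PyGAD calls above do this work for you; the sketch only shows why a bare solution vector cannot be used for prediction until it has been reshaped to match the network architecture.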

Step 2

The best_solution() method returns 3 outputs where the third one represents the index of the best solution. You can use it to make predictions as follows:

best_solution = saved_population.best_solution()
prediction = pygad.nn.predict(last_layer=population.population_networks[best_solution[2]], data_inputs=function_inputs, problem_type="regression")
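
The three return values can also be unpacked into named variables instead of being indexed as best_solution[0], best_solution[1] and best_solution[2]. The toy function below only mirrors the (solution, fitness, index) shape described above so the unpacking can be tried without PyGAD. pick_best is a hypothetical stand-in, not the library's code, and the population and fitness values are made up for illustration.

```python
def pick_best(population, fitness_values):
    """Return (solution, fitness, index) for the highest-fitness solution."""
    best_index = max(range(len(fitness_values)), key=fitness_values.__getitem__)
    return population[best_index], fitness_values[best_index], best_index


# Made-up population of three 2-gene solutions and their fitness values.
toy_population = [[0.1, 0.2], [0.4, -0.3], [0.0, 0.9]]
toy_fitness = [1.5, 4.2, 3.1]

# Unpack the triple into descriptive names.
solution, solution_fitness, solution_index = pick_best(toy_population, toy_fitness)
print(solution, solution_fitness, solution_index)
```

With the real instance, the same pattern would read: solution, solution_fitness, solution_index = saved_population.best_solution().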

Step 3

Finally, you can print the predicted values:

prediction = np.array(prediction)
print("Prediction of the best solution: {pred}".format(pred=prediction))
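
Beyond printing, it may help to compare the predictions with the known targets. The sketch below computes a mean absolute error in plain Python. The function and the sample numbers are illustrative assumptions, not part of PyGAD; with the question's data you would pass the flattened prediction and required_outputs arrays instead.

```python
def mean_absolute_error(predictions, targets):
    """Average absolute difference between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(predictions)


# Made-up values standing in for the network's predictions and the targets.
predicted = [0.9, 2.1, 2.5]
expected = [1.0, 2.0, 3.0]
mae = mean_absolute_error(predicted, expected)
print("Mean absolute error: {:.4f}".format(mae))
```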

Complete Code

Putting the three steps together, here is the complete code for making predictions with the best solution:

population_matrices = pygad.gann.population_as_matrices(population_networks=population.population_networks, population_vectors=saved_population.population)
population.update_population_trained_weights(population_trained_weights=population_matrices)

best_solution = saved_population.best_solution()
prediction = pygad.nn.predict(last_layer=population.population_networks[best_solution[2]], data_inputs=function_inputs, problem_type="regression")

prediction = np.array(prediction)
print("Prediction of the best solution: {pred}".format(pred=prediction))

In case something is not working, please let me know.

Thanks again for using PyGAD.
