znat

Reputation: 13474

TensorFlow - How to implement hyperparameter random search?

Consider this simple graph + session definition. Suppose I want to tune hyperparameters (learning rate and dropout keep probability) with a random search. What is the recommended way to implement it?

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():

    # Placeholders
    data = tf.placeholder(tf.float32, shape=(None, img_h, img_w, num_channels), name='data')
    labels = ...
    dropout_keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')

    # model architecture...

with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    for step in range(num_steps):
        offset = (step * batch_size) % (train_images.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_images[offset:(offset + batch_size), :]
        #...
        feed_train = {data: batch_data,
                      #...
                      learning_rate: 0.001,
                      dropout_keep_prob: 0.7
                     }

I tried putting everything inside a function

def run_model(learning_rate, keep_prob):
    graph = tf.Graph()
    with graph.as_default():
        # graph here...

    with tf.Session(graph=graph) as session:
        tf.initialize_all_variables().run()
        # session here...

But I ran into scope issues (I am not very familiar with scopes in Python/TensorFlow). Is there a best practice to achieve this?

Upvotes: 9

Views: 4161

Answers (1)

Zhongyu Kuang

Reputation: 5344

I implemented random search of hyper-parameters in a similar way, and things worked out fine. Basically, I have a function that generates random hyper-parameters outside of the graph and session. I wrapped the graph and session into a function as you did, and passed the generated hyper-parameters in. See the code below for a better illustration.

import numpy as np

def generate_random_hyperparams(lr_min, lr_max, kp_min, kp_max):
    '''generate random learning rate and keep probability'''
    # random search through log space for learning rate
    random_learning_rate = 10**np.random.uniform(lr_min, lr_max)
    random_keep_prob = np.random.uniform(kp_min, kp_max)
    return random_learning_rate, random_keep_prob
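
A quick aside on the log-space sampling: drawing the exponent uniformly from [lr_min, lr_max] spreads the samples evenly across orders of magnitude, which is usually what you want for learning rates. A small sanity check, using the same bounds (-5 and -1) as the usage example further down:

# With exponents drawn uniformly from [-5, -1], every sampled learning
# rate lands in [1e-5, 1e-1], spread evenly across orders of magnitude.
import numpy as np
samples = [10**np.random.uniform(-5, -1) for _ in range(5)]
assert all(1e-5 <= s <= 1e-1 for s in samples)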

I suspect the scope issue you are running into (since you didn't provide the exact error message, I can only speculate) is caused by some careless naming... I would change how you name the variables in your run_model function.

def run_model(random_learning_rate, random_keep_prob):
    # Note that the arguments are named differently from the placeholders in the graph
    graph = tf.Graph()
    with graph.as_default():
        # graph here...
        learning_rate = tf.placeholder(tf.float32, name='learning_rate')
        keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        # other operations ...

    with tf.Session(graph=graph) as session:
        tf.initialize_all_variables().run()
        # session here...
        feed_train = {data: batch_data,
                      # placeholder variables as dict keys, python value variables as dict values
                      learning_rate: random_learning_rate,
                      keep_prob: random_keep_prob
                     }
        # evaluate performance with random_learning_rate and random_keep_prob
        performance = session.run([...], feed_dict=feed_train)
    return performance

Remember to use different names for the tf.placeholder variables and the Python variables that carry the actual values.
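
To see why this matters: if the function argument and the placeholder share a name, the assignment inside the graph rebinds the name and shadows the argument, so the Python value is no longer reachable when you build the feed dict. Here is a minimal sketch of the collision I suspect you hit (reconstructed, since the exact error wasn't posted):

def run_model(learning_rate, keep_prob):  # Python floats passed in...
    graph = tf.Graph()
    with graph.as_default():
        # ...but this line rebinds the name: 'learning_rate' now refers to
        # the placeholder, and the float argument is no longer reachable.
        learning_rate = tf.placeholder(tf.float32, name='learning_rate')
    # a feed dict like {learning_rate: learning_rate} would then map the
    # placeholder to itself instead of to the intended float value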

The usage of the two functions above would be something like:

performance_records = {}
for i in range(10): # randomly sample the hyper-parameter space 10 times
    random_learning_rate, random_keep_prob = generate_random_hyperparams(-5, -1, 0.2, 0.8)
    performance = run_model(random_learning_rate, random_keep_prob)
    performance_records[(random_learning_rate, random_keep_prob)] = performance
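
Once the loop finishes, you can read the best configuration out of performance_records. For example, assuming performance is a scalar where higher is better (e.g. validation accuracy; session.run([...]) above is elided, so adapt as needed):

# Assumes each recorded performance is a scalar where higher is better;
# use min() instead if you record a loss.
best_lr, best_keep_prob = max(performance_records, key=performance_records.get)
print('best learning rate: %g, best keep prob: %g' % (best_lr, best_keep_prob))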

Upvotes: 4
