Surabhi Amit Chembra

Reputation: 561

Numpy vectorize python for loop

This is a code snippet that uses the Keras library to train a model:

    for state, action, reward, next_state, done in minibatch:
        target = reward
        if not done:
            target = (reward + self.gamma *
                      np.amax(self.model.predict(next_state)[0]))
        target_f = self.model.predict(state)
        #print (target_f)
        target_f[0][action] = target
        self.model.fit(state, target_f, epochs=1, verbose=0)

I am trying to vectorize it. The only approach I can think of is:

1. Create a NumPy table with each row = (state, action, reward, next_state, done, target), so there will be "mini-batch" number of rows.
2. Update the target column based on the other columns (using masked arrays):

        target[done == True] = reward
        target[done == False] = reward + self.gamma * np.amax(self.model.predict(next_state)[0])

3. Now call self.model.fit(state, target_f, epochs=1, verbose=0) with the updated targets.

NB: state is 8-D, so state vector has 8 elements.

Despite hours of effort, I am unable to code this properly. Is it actually possible to vectorize this piece of code?

Upvotes: 2

Views: 309

Answers (1)

Andreas Storvik Strauman

Reputation: 1655

You are very close! Assuming that minibatch is an np.array:

First split the rows by whether done is true, assuming done is at column index 4:

    minibatch_done = minibatch[np.where(minibatch[:, 4] == True)]
    minibatch_not_done = minibatch[np.where(minibatch[:, 4] == False)]
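As a rough illustration of this split, here is a minimal sketch on a made-up minibatch. The three transitions, the `(1, 8)` state shape, and the column layout (state, action, reward, next_state, done) are assumptions for the example, not something taken from the question's data. It keeps the row positions instead of copying the rows, so the original order can still be recovered later.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical transitions: (state, action, reward, next_state, done)
rows = [
    (rng.normal(size=(1, 8)), 0, 1.0, rng.normal(size=(1, 8)), False),
    (rng.normal(size=(1, 8)), 1, -1.0, rng.normal(size=(1, 8)), True),
    (rng.normal(size=(1, 8)), 1, 0.5, rng.normal(size=(1, 8)), False),
]

# Fill an object array cell by cell so NumPy does not try to unpack the states.
minibatch = np.empty((len(rows), 5), dtype=object)
for i, row in enumerate(rows):
    for j, item in enumerate(row):
        minibatch[i, j] = item

done = minibatch[:, 4].astype(bool)        # boolean mask, one entry per row
done_rows = np.flatnonzero(done)           # positions of terminal transitions
not_done_rows = np.flatnonzero(~done)      # positions of all other transitions

# Unpack columns into regular numeric arrays, still in minibatch order.
states = np.vstack(list(minibatch[:, 0]))  # shape (3, 8)
rewards = minibatch[:, 2].astype(float)    # shape (3,)
```

Converting the object columns to plain numeric arrays once, up front, tends to make the later arithmetic simpler than repeatedly slicing the object array.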

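Now use these to build the target vector conditionally, assuming index 2 is reward and index 3 is next_state. The done rows fill the first n_done entries and the remaining rows follow, so target is not in the original minibatch order: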

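    target = np.empty(minibatch.shape[0])
    n_done = minibatch_done.shape[0]
    # Rows where done is True come first (index 0...n_done): target is the reward alone
    target[:n_done] = minibatch_done[:, 2]
    # Remaining rows: reward plus the discounted best next-state value, one max per row
    next_q = self.model.predict(np.vstack(list(minibatch_not_done[:, 3])))
    target[n_done:] = minibatch_not_done[:, 2] + self.gamma * np.amax(next_q, axis=1)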
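For comparison, here is a minimal sketch of an order-preserving variant of the same idea that also fills in the full `target_f` matrix the question's loop builds. `TinyLinearModel` and `batch_targets` are hypothetical names introduced only for this example: the stand-in model mimics the shape behaviour of `predict` so the snippet runs without Keras, and the two actions and the random data are assumptions.

```python
import numpy as np

class TinyLinearModel:
    """Hypothetical stand-in for a Keras model: predict() maps (n, 8) -> (n, 2)."""
    def __init__(self, n_inputs=8, n_actions=2, seed=0):
        self.w = np.random.default_rng(seed).normal(size=(n_inputs, n_actions))

    def predict(self, x):
        return np.asarray(x, dtype=float) @ self.w

def batch_targets(model, states, actions, rewards, next_states, dones, gamma):
    """Return the (n, n_actions) target matrix for a whole minibatch."""
    # Best next-state Q-value per row (axis=1), not one global maximum.
    max_next_q = np.amax(model.predict(next_states), axis=1)
    # Terminal rows keep the bare reward; the rest add the discounted estimate.
    targets = np.where(dones, rewards, rewards + gamma * max_next_q)
    target_f = model.predict(states)
    # Overwrite only the Q-value of the action actually taken in each row.
    target_f[np.arange(len(actions)), actions] = targets
    return target_f

# Made-up minibatch of 4 transitions with 8-D states.
rng = np.random.default_rng(1)
states = rng.normal(size=(4, 8))
next_states = rng.normal(size=(4, 8))
actions = np.array([0, 1, 1, 0])
rewards = np.array([1.0, -1.0, 0.5, 0.0])
dones = np.array([False, True, False, True])

model = TinyLinearModel()
target_f = batch_targets(model, states, actions, rewards, next_states, dones, gamma=0.95)
```

With a real Keras model, a single `model.fit(states, target_f, epochs=1, verbose=0)` on the whole batch would then replace the per-sample calls. That is not numerically identical to the loop, since the loop updates the weights between samples, but it is the usual trade-off when batching.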

And there you have it :)

Edit: Fixed index error in target problems

Upvotes: 3
