ajikodajis

Reputation: 45

Q-table representation

As far as I understand Q-learning, a Q-value is a measure of "how good" a particular state-action pair is. This is usually represented in a table in one of the following ways (see fig.):

[Figure: two Q-table layouts; the top one is a state-to-state transition table]

  1. Are both representations valid?
  2. How do you determine the best action if the Q-table is given as a state-to-state transition table (as in the top Q-table in the figure), especially if the state transitions are not deterministic, i.e. taking an action from a state can land you in different states at different times?

Upvotes: 0

Views: 1129

Answers (1)

Don Reba

Reputation: 14031

  1. No. In general, an action is not equivalent to a transition to a particular state. The number of actions can differ from the number of states, the same action can lead to different states depending on the state in which it is performed, and different actions can lead to the same state. Transitions can also be stochastic.

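  2. See (1): a state-to-state table does not identify actions, so it cannot tell you which one is best. With a state-action table, the best action in a state is the one with the highest Q-value in that state's row, and since a Q-value is an expected return, stochastic transitions are already averaged into it.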
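To make the state-action indexing concrete, here is a minimal sketch of tabular Q-learning on a made-up problem. Everything in it is an assumption for illustration, not something from the question: the three states, the two actions, the 0.8 success probability, and the names `step`, `best_action` and `train`. The table has one row per state and one column per action, so picking the best action is just an argmax over a row, even though the same action can land in different states.

```python
import random

# Hypothetical toy problem: 3 states, 2 actions (illustrative values only).
N_STATES, N_ACTIONS = 3, 2
GOAL = 2  # terminal state


def step(state, action, rng):
    """Stochastic transition: the same action can land in different states."""
    if action == 1 and rng.random() < 0.8:
        next_state = min(state + 1, GOAL)  # usually moves toward the goal
    else:
        next_state = state  # action 0, or a failed action 1, stays put
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def best_action(q, state):
    """Greedy choice: index of the largest Q-value in this state's row."""
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # Q-table indexed as q[state][action], not q[state][next_state].
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = best_action(q, state)  # exploit
            next_state, reward = step(state, action, rng)
            # The update uses whichever next state was actually sampled;
            # over many samples this averages out the stochasticity.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = train()
print([best_action(q, s) for s in range(GOAL)])
```

Note that the next state never appears as a table index. It only enters through the sampled update target, which is why non-deterministic transitions pose no problem for this layout.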

Upvotes: 2
