Reputation: 57611
I'm trying to adapt a C program on reinforcement learning, https://webdocs.cs.ualberta.ca/~sutton/book/code/pole.c, to Python to participate in the OpenAI Gym. I've copied the get_box function into a separate test program:
#include <stdio.h>

int get_box(float x, float x_dot, float theta, float theta_dot);

int main() {
    int box;
    box = get_box(0.01, 0.01, 0.01, 0.01);
    printf("The value of box is : %x\n", box);
    return 0;
}

#define one_degree 0.0174532 /* 2pi/360 */
#define six_degrees 0.1047192
#define twelve_degrees 0.2094384
#define fifty_degrees 0.87266

int get_box(x,x_dot,theta,theta_dot)
float x,x_dot,theta,theta_dot;
{
    int box=0;

    if (x < -2.4 ||
        x > 2.4 ||
        theta < -twelve_degrees ||
        theta > twelve_degrees) return(-1); /* to signal failure */

    if (x < -0.8) box = 0;
    else if (x < 0.8) box = 1;
    else box = 2;

    if (x_dot < -0.5) ;
    else if (x_dot < 0.5) box += 3;
    else box += 6;

    if (theta < -six_degrees) ;
    else if (theta < -one_degree) box += 9;
    else if (theta < 0) box += 18;
    else if (theta < one_degree) box += 27;
    else if (theta < six_degrees) box += 36;
    else box += 45;

    if (theta_dot < -fifty_degrees) ;
    else if (theta_dot < fifty_degrees) box += 54;
    else box += 108;

    return(box);
}
which I call scratch.c. If I compile this program with gcc scratch.c -lm and run it with ./a.out, I get the following printed output:
The value of box is : 55
However, if I go through the conditional statements manually I would expect to get 1 + 3 + 27 + 54 = 85, which is also what I get with my Python program. Why does the program print 55?
Upvotes: 0
Views: 73
Reputation: 64
Because your output is being printed as a hexadecimal number: the %x conversion in your printf call formats the integer in base 16. Hexadecimal 55 is equal to decimal 85.
Upvotes: -1
Reputation: 2882
If you use printf("%d\n", box) instead of printf("%x\n", box), you'll get the decimal value printed. The two outputs are the same number in different bases: 0x55 = 5*16 + 5 = 85.
Upvotes: 3