Reputation: 137
so I am using a convolutional layer as the first layer of a neural network for deep reinforcement learning to get the spatial features out of a simulation I built. The simulation gives different maps that are of different lengths and heights to process. If I understand convolutional networks, this should not matter since the channel size is kept constant. In between the convolutional network and the fully connected layers there is a spatial pyramid pooling layer so that the varying image sizes does not matter. Also the spatial data is pretty sparse. Usually it is able to go through a few states and sometimes a few episodes before the first convolutional layer spits out all Nans. Even when I fix the map size this happens. I do not know where the problem lies, where can the problem lie?
Upvotes: 0
Views: 2432
Reputation: 326
Try to initialize your weights with random numbers between 0 and 1 and then try different learning rates for your network training. (I suggest test it with learning rates equal to 10, 1, 0.1, 0.01, ...)
Upvotes: 0