Reputation: 184
I am trying to implement a Convolutional Neural Network in Python. The architecture is as follows:
INPUT -> [Convolution -> Sigmoid -> Pooling] -> [Convolution -> Sigmoid -> Pooling] -> Fully Connected Layer -> Hidden Layer -> Output
Input shape: 28*28
Filters/weights shape for Convolutional layer 1: 20*1*5*5
Filters/weights shape for Convolutional layer 2: 40*20*5*5
Activation Function: Sigmoid (1/(1+e^-x))
Because each filter in Convolutional Layer 2 sums over many inputs (20*5*5 = 500 weights per filter), the dot products there come out near 20 or higher, which in turn drives the outputs of the sigmoid activation to all 1's (a quick standalone check of this is shown after the outputs below).
Output at Convolutional layer 1:
[ 0.75810452 0.79819809 0.70897314 0.50897858 0.02901152 0.98447587
0.99995668 0.99999814 0.99912627 0.7885211 0.87708188 0.76611807]
...
...
Output at Convolutional layer 2:
[ 19.88641441 20.11005634 20.04984707 20.19106394 19.93096274
20.1585536 19.84757161 19.79030395]
...
...
Output after applying sigmoid on conv layer 2:
[ 1. 1. 1. 1. 1. 1. 1. 1.]
...
...
[ 1. 1. 1. 0.99999 1. 1. 1. 1.]
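To confirm the saturation, here is a quick standalone check of the sigmoid at these magnitudes (a minimal sketch, separate from my network code; the values are copied from the conv layer 2 output above):
import numpy
x = numpy.array([19.88641441, 20.11005634, 20.19106394])
print(1. / (1. + numpy.exp(-x)))   # the results display as 1. at default print precision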
I have found a similar question on this forum: Neural Network sigmoid function. I did not make the mistakes pointed out in Tim's answer. But what I couldn't figure out was this:
Finally, even with these changes, a fully-connected neural network with all positive weights will probably still produce all 1's for the output. You can either include negative weights corresponding to inhibitory nodes, or reduce connectivity significantly (e.g. with a 0.1 probability that a node in layer n connects to a node in layer n+1).
Should I normalize the output after applying sigmoid on conv layer 2, or try something else?
EDIT: Input data:
[[ 3. 0. 0. 3. 7. 3. 0. 3. 0. 11. 0. 0.
3. 0. 0. 3. 8. 0. 0. 3. 0. 0. 0. 2.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 5. 0. 12. 0.
16. 0. 0. 4. 0. 2. 8. 3. 0. 4. 8. 0.
0. 0. 0. 0.]
[ 0. 0. 2. 0. 0. 0. 1. 2. 1. 12. 0. 8.
0. 0. 6. 0. 11. 0. 0. 6. 7. 2. 0. 0.
0. 0. 0. 0.]
[ 0. 1. 3. 0. 0. 2. 3. 0. 0. 0. 12. 0.
0. 23. 0. 0. 0. 0. 11. 3. 0. 0. 4. 0.
0. 0. 0. 0.]
[ 0. 1. 1. 0. 0. 2. 0. 0. 6. 0. 25. 27.
136. 135. 188. 89. 84. 25. 0. 0. 3. 1. 0. 0.
0. 0. 0. 0.]
[ 4. 0. 0. 0. 0. 0. 0. 0. 3. 88. 247. 236.
255. 249. 250. 227. 240. 136. 37. 1. 0. 2. 2. 0.
0. 0. 0. 0.]
[ 2. 0. 0. 3. 0. 0. 4. 27. 193. 251. 253. 255.
255. 255. 255. 240. 254. 255. 213. 89. 0. 0. 14. 1.
0. 0. 0. 0.]
[ 0. 0. 0. 6. 0. 0. 18. 56. 246. 255. 253. 243.
251. 255. 245. 255. 255. 254. 255. 231. 119. 7. 0. 5.
0. 0. 0. 0.]
[ 4. 0. 0. 12. 13. 0. 65. 190. 246. 255. 255. 251.
255. 109. 88. 199. 255. 247. 250. 255. 234. 92. 0. 0.
0. 0. 0. 0.]
[ 0. 10. 1. 0. 0. 18. 163. 248. 255. 235. 216. 150.
128. 45. 6. 8. 22. 212. 255. 255. 252. 172. 0. 15.
0. 0. 0. 0.]
[ 0. 1. 4. 5. 0. 0. 187. 255. 254. 94. 57. 7.
1. 0. 6. 0. 0. 139. 242. 255. 255. 218. 62. 0.
0. 0. 0. 0.]
[ 5. 2. 0. 0. 11. 56. 252. 235. 253. 20. 5. 2.
5. 1. 0. 1. 2. 0. 97. 249. 248. 249. 166. 8.
0. 0. 0. 0.]
[ 0. 0. 2. 0. 0. 70. 255. 255. 245. 25. 10. 0.
0. 1. 0. 4. 10. 0. 10. 255. 246. 250. 155. 0.
0. 0. 0. 0.]
[ 2. 0. 7. 12. 0. 87. 226. 255. 184. 0. 3. 0.
10. 5. 0. 0. 0. 9. 0. 183. 251. 255. 222. 15.
0. 0. 0. 0.]
[ 0. 5. 1. 0. 19. 230. 255. 243. 255. 35. 2. 0.
0. 0. 0. 9. 8. 0. 0. 70. 245. 242. 255. 14.
0. 0. 0. 0.]
[ 0. 4. 3. 0. 19. 251. 239. 255. 247. 30. 1. 0.
4. 4. 14. 0. 0. 2. 0. 47. 255. 255. 247. 21.
0. 0. 0. 0.]
[ 6. 0. 2. 2. 0. 173. 247. 252. 250. 28. 10. 0.
0. 8. 0. 0. 0. 8. 0. 67. 249. 255. 255. 12.
0. 0. 0. 0.]
[ 0. 0. 6. 3. 0. 88. 255. 251. 255. 188. 21. 0.
15. 0. 8. 2. 16. 0. 35. 200. 247. 251. 134. 4.
0. 0. 0. 0.]
[ 0. 3. 3. 1. 0. 11. 211. 247. 249. 255. 189. 76.
0. 0. 4. 0. 2. 0. 169. 255. 255. 247. 47. 0.
0. 0. 0. 0.]
[ 0. 6. 0. 0. 2. 0. 59. 205. 255. 240. 255. 182.
41. 56. 28. 33. 42. 239. 246. 251. 238. 157. 0. 1.
0. 0. 0. 0.]
[ 2. 1. 0. 0. 2. 10. 0. 104. 239. 255. 240. 255.
253. 247. 237. 255. 255. 250. 255. 239. 255. 100. 0. 1.
0. 0. 0. 0.]
[ 1. 0. 3. 0. 0. 7. 0. 4. 114. 255. 255. 255.
255. 247. 249. 253. 251. 254. 237. 251. 89. 0. 0. 1.
0. 0. 0. 0.]
[ 0. 0. 9. 0. 0. 1. 13. 0. 14. 167. 255. 246.
253. 255. 255. 254. 242. 255. 244. 61. 0. 19. 0. 1.
0. 0. 0. 0.]
[ 2. 1. 7. 0. 0. 4. 0. 14. 0. 27. 61. 143.
255. 255. 252. 255. 149. 21. 6. 16. 0. 0. 7. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]]
Weights for conv layer 1:
[[[-0.01216923 -0.00584966 0.04876327 0.04628595 0.05644253]
[-0.03813031 -0.0304277 0.05728934 -0.01358741 -0.02875361]
[ 0.04929296 0.05958448 0.05497736 0.04699187 -0.04964543]
[ 0.01874465 0.05793848 0.03988833 -0.02355133 -0.05672331]
[ 0.03986748 -0.06098319 0.01299825 -0.00239702 -0.01750711]]]
[[[-0.02474246 0.0423619 -0.02130952 0.00718671 0.02677802]
[ 0.04151089 0.04336411 -0.03549197 -0.01935773 0.04035303]
[ 0.01466489 -0.01117737 0.0081063 0.01310948 0.01900553]
[-0.01723775 0.0148552 -0.03563556 -0.04108806 0.01764391]
[ 0.03932499 -0.00911049 0.00443425 -0.0388128 0.01646769]]
...........
...........
Weights at conv layer 2:
[[-0.02894977 -0.00163836 0.0416469 -0.00195158 0.03194728]
[ 0.02618844 -0.00961595 -0.03348994 0.04460359 0.03113144]
[ 0.04166139 -0.02487885 0.02173471 -0.00147136 0.00803713]
[ 0.02262536 -0.03310476 -0.00949261 -0.0450313 0.03128755]
[-0.01181284 0.00558957 -0.02410718 0.01706195 0.01151338]]
[[ 0.04118888 -0.01306432 -0.01013332 0.03423443 0.03135569]
[ 0.00471491 0.02169717 0.00583819 -0.02421325 -0.01708062]
[-0.01244262 -0.00934037 0.00605259 -0.03825137 -0.00606101]
[-0.01699741 0.01311037 0.0307442 0.04153474 -0.00470464]
[-0.02592571 -0.01203504 0.04052782 0.03150989 0.02740532]]
.........
.........
The weights were initialized using Xavier initialization:
import numpy

n_in = 28*28                 # input image size (28x28)
n_out = 24*24                # feature-map size after a 5x5 convolution (24x24)
w_bound = numpy.sqrt(6. / float(n_in + n_out))
filters = numpy.random.uniform(-w_bound, w_bound, (40, 20, 5, 5))   # filters for conv layer 2
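For comparison, here is a sketch of the same Xavier bound computed from the filter dimensions rather than the image dimensions (the convention usually described for convolutional layers: fan-in = input channels * kernel height * kernel width, fan-out = output channels * kernel height * kernel width). This is not what my current code does; I am including it only as a point of reference:
import numpy
fan_in = 20 * 5 * 5    # input channels * kernel height * kernel width
fan_out = 40 * 5 * 5   # output channels * kernel height * kernel width
w_bound = numpy.sqrt(6. / float(fan_in + fan_out))
filters = numpy.random.uniform(-w_bound, w_bound, (40, 20, 5, 5))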
Upvotes: 0
Views: 2035
Reputation: 467
1- Did you normalize the output between 0 and 1? If you didn't, every output much larger than 1 gets squashed to 1 by the sigmoid.
2- Normalize the input data: divide it by 255, which is the largest pixel value (see the sketch below).
3- There is definitely a problem in "Output at Convolutional layer 2":
[ 19.88641441 20.11005634 20.04984707 20.19106394 19.93096274 20.1585536 19.84757161 19.79030395]
These numbers shouldn't be possible. How do you initialize your conv layers? They should be initialized between 0 and 1.
Also, you should normalize the conv outputs. To do this (and I'm assuming you're doing this implementation for educational purposes), apply the sigmoid to the outputs of all the convs; this way your conv values don't keep growing. Normally, ReLU activation gives better results with convolutional layers, but you can get good results with sigmoid too.
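Here is a minimal sketch of points 1 and 2 (assumptions: a raw 28*28 grayscale image with pixel values in 0-255; the variable names are just for illustration):
import numpy
def sigmoid(x):
    return 1. / (1. + numpy.exp(-x))    # logistic sigmoid
image = numpy.random.randint(0, 256, (28, 28)).astype(numpy.float64)   # stand-in for one raw input image
image /= 255.                            # point 2: scale pixels into [0, 1] before the first convolution
# Point 1: squash every convolution's output with the activation so the values stay bounded,
# e.g. conv_out = sigmoid(conv_out), or with ReLU: conv_out = numpy.maximum(0., conv_out)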
Upvotes: 0