daifeng

Reputation: 11

Why does Keras behave better than PyTorch under the same network configuration?

Recently, I compared a Keras implementation and a PyTorch implementation of UNet++ on the same dataset. With Keras the loss decreases steadily and the accuracy is higher after 10 epochs, while with PyTorch the loss decreases unevenly and the accuracy is lower after 10 epochs. Has anyone run into this problem, and does anyone have an explanation?

The final PyTorch training log looks like this:

2019-12-15 18:14:20 Epoch:9 Iter: 1214/1219 loss:0.464673 acc:0.581713
2019-12-15 18:14:21 Epoch:9 Iter: 1215/1219 loss:0.450462 acc:0.584101
2019-12-15 18:14:21 Epoch:9 Iter: 1216/1219 loss:0.744811 acc:0.293406
2019-12-15 18:14:22 Epoch:9 Iter: 1217/1219 loss:0.387612 acc:0.735630
2019-12-15 18:14:23 Epoch:9 Iter: 1218/1219 loss:0.767146 acc:0.364759

The final Keras training log looks like this:

685/690 [============================>.] - ETA: 2s - loss: 0.4940 - acc: 0.7309
686/690 [============================>.] - ETA: 1s - loss: 0.4941 - acc: 0.7306
687/690 [============================>.] - ETA: 1s - loss: 0.4939 - acc: 0.7308
688/690 [============================>.] - ETA: 0s - loss: 0.4942 - acc: 0.7303
689/690 [============================>.] - ETA: 0s - loss: 0.4943 - acc: 0.7302

Upvotes: 1

Views: 917

Answers (1)

Separius

Reputation: 1296

Well, it's pretty hard to say without any code snippets. That being said, in general, initialization matters far more than you might think. The default weight initialization in PyTorch is different from the one in Keras, and I've had similar issues in the past because of it.
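For example, here is a minimal sketch (assuming a model built from Conv2d/ConvTranspose2d/Linear layers; init_keras_style is just an illustrative helper name, not part of either library) that re-initializes a PyTorch model with Keras-style defaults, i.e. Glorot/Xavier uniform weights and zero biases, so both frameworks start from comparable weights:

    import torch.nn as nn

    def init_keras_style(model: nn.Module) -> None:
        # Walk every submodule and overwrite PyTorch's default init
        # (Kaiming uniform) with the Keras defaults.
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
                nn.init.xavier_uniform_(m.weight)   # Keras default: glorot_uniform
                if m.bias is not None:
                    nn.init.zeros_(m.bias)          # Keras default: zeros

    # usage (model name is a placeholder):
    # model = UNetPlusPlus(...)
    # init_keras_style(model)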

Another thing to check is the optimizer parameters: make sure not only that you are using the same optimizer (SGD, Adam, ...), but also that it is configured with the same parameters (learning rate, betas, momentum, ...).
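As an illustration, a sketch of spelling out the same Adam settings explicitly on both sides (assuming tf.keras; the 1e-3 learning rate is just a placeholder, and note that the default epsilon differs slightly between the two frameworks, so it is worth setting it explicitly too):

    import torch
    from tensorflow.keras.optimizers import Adam

    # PyTorch side: pass every hyperparameter explicitly instead of
    # relying on defaults (PyTorch's default eps is 1e-8).
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                                 betas=(0.9, 0.999), eps=1e-7)

    # Keras side: same values spelled out (tf.keras's default epsilon is 1e-7).
    keras_optimizer = Adam(learning_rate=1e-3, beta_1=0.9,
                           beta_2=0.999, epsilon=1e-7)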

Upvotes: 1
