Emil

Reputation: 725

MXNet AdamW optimizer

The Adam optimizer has flaws when used with weight decay. In 2018, the AdamW optimizer was proposed to address them.
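For context, here is a minimal sketch of the difference, written in plain Python on a single scalar weight rather than against any MXNet API. The function names `adam_l2_step` and `adamw_step` and the default hyperparameters are illustrative assumptions, not part of MXNet. The AdamW step uses the decoupled form, where the decay is applied directly to the weight (scaled by the learning rate) instead of being added to the gradient.

```python
import math


def _adam_direction(g, m, v, t, b1, b2, eps):
    """Return the bias-corrected Adam direction and the updated moments."""
    m = b1 * m + (1 - b1) * g          # first moment estimate
    v = b2 * v + (1 - b2) * g * g      # second moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction, t starts at 1
    v_hat = v / (1 - b2 ** t)
    return m_hat / (math.sqrt(v_hat) + eps), m, v


def adam_l2_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    """Adam with L2 regularization: the decay term is folded into the
    gradient, so it is rescaled by the adaptive denominator."""
    g = g + wd * w
    direction, m, v = _adam_direction(g, m, v, t, b1, b2, eps)
    return w - lr * direction, m, v


def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    """AdamW (decoupled weight decay): the moments see only the raw
    gradient, and the decay is applied to the weight separately."""
    direction, m, v = _adam_direction(g, m, v, t, b1, b2, eps)
    return w - lr * (direction + wd * w), m, v


if __name__ == "__main__":
    # With a zero gradient, only the weight decay term moves the weight.
    print(adam_l2_step(1.0, 0.0, 0.0, 0.0, t=1, wd=0.01)[0])
    print(adamw_step(1.0, 0.0, 0.0, 0.0, t=1, wd=0.01)[0])
```

With a zero gradient, the L2 variant moves the weight by roughly a full `lr` step no matter how small `wd` is, because the decay term gets normalized by its own magnitude. The decoupled variant shrinks the weight by `lr * wd * w`, which is the behavior AdamW is meant to restore.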

Is there a standard way to implement AdamW in the MXNet framework (Python implementation)? There is an mxnet.optimizer.Adam class, but no mxnet.optimizer.AdamW class (checked in the mxnet-cu102==1.6.0 and mxnet==1.5.0 packages).

P.S. I asked this question on the MXNet forum and on datascience.stackexchange.com, but to no avail.

Upvotes: 1

Views: 357

Answers (1)

Alex I

Reputation: 20287

Short answer: There isn't a standard way to use AdamW in Gluon yet, but there is some existing work in that direction that would make it relatively easy to add.

Longer answer:

Please let me know if you get this working, as I'd love to be able to use that as well.

Upvotes: 1
