Reputation: 135
I am studying how to train a normalizing flow model from the tutorial below, and I don't understand how the log-determinant expression `0 - log(256) * (28*28*1)` arises.
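For reference, a uniform-dequantization step of the following shape would produce exactly that constant. This is a minimal sketch, not code quoted from the tutorial; the name `dequantize` and the exact form of the noise and rescaling are assumptions:

```python
import numpy as np
import torch

def dequantize(z, ldj):
    # Hypothetical sketch: add uniform noise to the integer pixel values,
    # then rescale from [0, 256) to [0, 1).  Dividing by 256 is an
    # elementwise map with derivative 1/256, so it contributes
    # -log(256) per dimension to the log-det-Jacobian accumulator.
    z = z.to(torch.float32)
    z = z + torch.rand_like(z)
    z = z / 256.0
    ldj = ldj - np.log(256) * np.prod(z.shape[1:])
    return z, ldj

# For MNIST-shaped inputs (B, 1, 28, 28) and ldj starting at 0, this
# gives 0 - log(256) * (28*28*1) = -784 * log(256) ≈ -4347.42.
x = torch.randint(0, 256, (8, 1, 28, 28))
z, ldj = dequantize(x, torch.zeros(1))
print(ldj)  # ≈ tensor([-4347.42])
```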
Upvotes: 0
Views: 734
Reputation: 95
Note that 1-self.alpha
is the derivative of the scaling operation, so the Jacobian of this operation is a diagonal matrix with np.prod(z.shape[1:])
entries on the diagonal, each equal to 1-self.alpha. The Jacobian determinant is therefore simply the product of these diagonal entries, which gives rise to
ldj += np.log(1-self.alpha) * np.prod(z.shape[1:])
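To make the diagonal-Jacobian argument concrete, here is a quick numerical check. This is a minimal sketch; the exact form of the scaling, z * (1 - alpha) + 0.5 * alpha, is an assumption about what the tutorial does around the sigmoid:

```python
import numpy as np
import torch

alpha = 1e-5
scale = lambda z: z * (1 - alpha) + 0.5 * alpha  # assumed scaling operation

z = torch.rand(4)  # tiny input so the full Jacobian is cheap to build
J = torch.autograd.functional.jacobian(scale, z)
print(J)                              # diagonal matrix with 1 - alpha on the diagonal
print(torch.logdet(J).item())         # ≈ 4 * log(1 - alpha)
print(np.log(1 - alpha) * z.numel())  # the closed form: same value
```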
The second line accounts for the log determinant of the sigmoid $s(z)$, since $s'(z)=s(z)(1-s(z))$. The two lines thus result from applying the chain rule, which turns the product of determinants into a sum once the logarithm is taken.
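Written out, using $\log s(z) = -\operatorname{softplus}(-z)$, $\log(1-s(z)) = -\operatorname{softplus}(z)$ and the identity $\operatorname{softplus}(z) = z + \operatorname{softplus}(-z)$:

$$\log s'(z) = \log s(z) + \log\big(1-s(z)\big) = -\operatorname{softplus}(-z) - \operatorname{softplus}(z) = -z - 2\operatorname{softplus}(-z),$$

which, summed over the non-batch dimensions, is the quantity the second line adds to ldj.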
Setting ldj = torch.zeros(1,)
is just the initialization of this variable; its value will only be updated inside the module. I'm not sure what the motivation is, but it could be that they want to apply the dequant_module to each individual sample in the batch.
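On that last point, here is a minimal sketch of how ldj is typically threaded through a flow, assuming each flow module takes and returns the pair (z, ldj); the names are hypothetical:

```python
import torch

def forward(x, flows):
    z = x
    # One log-det-Jacobian accumulator per sample; torch.zeros(1,) would
    # also work because it broadcasts against per-sample updates.
    ldj = torch.zeros(x.shape[0], device=x.device)
    for flow in flows:
        # Each layer returns its transformed output and adds its own
        # log|det J| to the running total (the log of a product of
        # determinants becomes a sum).
        z, ldj = flow(z, ldj)
    return z, ldj
```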
Upvotes: 1