masterBroesel

Reputation: 299

Inexplicable behaviour when using numpy .T as init for PyTorch weights

I use numpy to initialize the weights of my PyTorch MLP. It is a very small network: 2 layers with 21 neurons each. The network outputs BRDF values that are then rendered by Mitsuba 0.6.0.

The peculiar issue appears when I transpose the numpy arrays during initialization. Version A gives me a network that renders correctly in Mitsuba (as I would expect). Version B, which should be equivalent, gives me a network that reaches the same loss in PyTorch but renders different values in Mitsuba.

import numpy as np
import torch

# Version A: sample with shape (6, 21), then transpose to (21, 6)
w = np.random.uniform(low=-0.05, high=0.05, size=(6, 21)).astype(np.float32)
model.fc1.weight = torch.nn.Parameter(torch.from_numpy(w.T), requires_grad=True)

# Version B: sample directly with shape (21, 6)
w = np.random.uniform(low=-0.05, high=0.05, size=(21, 6)).astype(np.float32)
model.fc1.weight = torch.nn.Parameter(torch.from_numpy(w), requires_grad=True)

Note that in Version B the only changes are the swapped dimensions and the dropped call to transpose. The resulting shape is therefore the same as in Version A, and the contents should be statistically equivalent, since both are sampled from the same distribution (a minimal check of this claim is included below).
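For completeness, here is a minimal standalone check along these lines (assuming fc1 is a torch.nn.Linear(6, 21), so the weight has shape (21, 6)). The shapes and dtypes match; the only structural difference I can see is that the Version A tensor is a non-contiguous view of the transposed numpy buffer:

import numpy as np
import torch

w_a = np.random.uniform(low=-0.05, high=0.05, size=(6, 21)).astype(np.float32)
w_b = np.random.uniform(low=-0.05, high=0.05, size=(21, 6)).astype(np.float32)

t_a = torch.from_numpy(w_a.T)  # Version A: transposed view of the numpy array
t_b = torch.from_numpy(w_b)    # Version B: sampled directly in the target shape

print(t_a.shape, t_b.shape)    # both torch.Size([21, 6])
print(t_a.dtype, t_b.dtype)    # both torch.float32
print(t_a.is_contiguous(), t_b.is_contiguous())  # False vs. True, since numpy's .T is a strided view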

I cannot share an MWE, as this is proprietary research, but I assure you that the ONLY thing I changed between these two runs are the two lines in the code snippets above. I do not think Mitsuba is at fault either, because the first network (Version A) renders fine, and the second network is equivalent to it except for the init. I also tried mimicking the numpy inits with the respective PyTorch equivalents (a sketch of what I mean is shown below), and the issue persists.
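For reference, the PyTorch-equivalent init I mean looks roughly like this (a sketch on a stand-in layer, not my actual model):

import torch

# Stand-in for the first layer of the MLP described above (6 inputs, 21 neurons).
fc1 = torch.nn.Linear(6, 21)
with torch.no_grad():
    # Same uniform range as the numpy init, done natively in PyTorch.
    fc1.weight.uniform_(-0.05, 0.05)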

Any help is greatly appreciated!!

[Image: resulting render with Version A]
[Image: resulting render with Version B]

Upvotes: 0

Views: 90

Answers (0)
