Mir Henglin

Broadcasting mathematical operations with PyMC3/Theano

I think this issue boils down to my lack of understanding of how Theano works. I'm in a situation where I want to create a variable that is the result of a subtraction between a distribution and a NumPy array. This works fine when I specify the shape parameter as 1:

import pymc3 as pm
import numpy as np
import theano.tensor as T

X = np.random.randint(low = -10, high = 10, size = 100)

with pm.Model() as model:
    nl = pm.Normal('nl', shape = 1)
    det = pm.Deterministic('det', nl - X)

nl.dshape
(1,)

However, this breaks when I specify shape > 1:

with pm.Model() as model:
    nl = pm.Normal('nl', shape = 2)
    det = pm.Deterministic('det', nl - X)

ValueError: Input dimension mis-match. (input[0].shape[0] = 2, input[1].shape[0] = 100)

nl.dshape
(2,)

X.shape
(100,)
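
For comparison, plain NumPy enforces the same broadcasting rule, so the mismatch can be reproduced without PyMC3 at all (a minimal sketch; the zero arrays are just stand-ins):

import numpy as np

X = np.random.randint(low = -10, high = 10, size = 100)

(np.zeros(1) - X).shape  # (100,): a length-1 axis stretches to length 100
np.zeros(2) - X          # ValueError: operands could not be broadcast
                         # together with shapes (2,) (100,)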

I tried transposing X to make it broadcastable:

X2 = X.reshape(-1, 1).transpose()

X2.shape
(1, 100)

But now it declares a mismatch at .shape[1] instead of .shape[0], because broadcasting aligns trailing axes, so the length-2 vector now lines up against the length-100 axis:

with pm.Model() as model:
    nl = pm.Normal('nl', shape = 2)
    det = pm.Deterministic('det', nl - X2)

ValueError: Input dimension mis-match. (input[0].shape[1] = 2, input[1].shape[1] = 100)

I can make this work if I loop over the elements of the distribution:

distShape = 2
with pm.Model() as model:
    nl = pm.Normal('nl', shape = distShape)

    det = {}
    for i in range(distShape):
        det[i] = pm.Deterministic('det' + str(i), nl[i] - X)

det
{0: det0, 1: det1}

However, this feels inelegant and constrains me to using loops for the rest of the model. I was wondering if there is a way to specify this operation so that it works the same way it does between two distributions:

distShape = 2
with pm.Model() as model:
    nl0 = pm.Normal('nl0', shape = distShape)
    nl1 = pm.Normal('nl1', shape = 1)

    det = pm.Deterministic('det', nl0 - nl1)


Answers (1)

aloctavodia

You can add a trailing axis to X:

X = np.random.randint(low = -10, high = 10, size = 100)
X = X[:, None]  # or X.reshape(-1, 1)

and then:

with pm.Model() as model:
    nl = pm.Normal('nl', shape = 2)
    det = pm.Deterministic('det', nl - X)

In this case the shapes of nl and X will be (2,) and (100, 1), respectively, which broadcast against each other; the resulting det has shape (100, 2).
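
A quick way to sanity-check the resulting shape without building the model (a NumPy-only sketch; the zero arrays are just stand-ins for nl and the reshaped X):

import numpy as np

nl_like = np.zeros(2)        # same shape as nl: (2,)
X_col = np.zeros((100, 1))   # same shape as the reshaped X: (100, 1)

print(np.broadcast(nl_like, X_col).shape)  # (100, 2): one row per data
                                           # point, one column per component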

Notice that we get the same behavior with two NumPy arrays (not only with a Theano tensor and a NumPy array):

a0 = np.array([1, 2])
b0 = np.array([1, 2, 3, 5])
a0 = a0[:, None]  # comment/uncomment this line to toggle the error
print(a0.shape, b0.shape)
b0 - a0
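
With the reshape in place the shapes are (2, 1) and (4,), and b0 - a0 broadcasts to shape (2, 4); with that line commented out, NumPy raises the familiar "operands could not be broadcast together with shapes (4,) (2,)" error.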

