elpres
elpres

Reputation: 416

How to apply a custom function to a variable in PyMC?

At one step in the model I'm writing, i have to calculate the error function of a quantity. What I'm trying to do looks like this:

from math import erf
import numpy as np
import pymc as pm

sig = pm.Exponential('sig', beta=0.1, size=10)
x = erf(sig ** 2)

This fails because erf doesn't work on arrays. I tried:

@pm.deterministic
def x(sig=sig):
    return [erf(s) for s in sig]

but with no success, I know it's possible to get the result with:

np_erf = np.vectorize(erf)
x = np_erf((sig ** 2).value)

but this doesn't seem like the correct way because it doesn't produce a pm.Deterministic but just a np.array. How can I do it instead? (PyMC is version 2.3)


Edit: The above examples were simplified for clarity, here's the what the relevant passages look in the real code. Ideally, I would like this to work:

mu = pm.LinearCombination('mu', [...], [...])
sig2 = pm.exp(mu) ** 2
f = 1 / (pm.sqrt(np.pi * sig2 / 2.0) * erf(W / sig2))

but if fails with the message TypeError: only length-1 arrays can be converted to Python scalars. Going the np.vectorize route

np_erf = np.vectorize(erf)
f = 1 / (pm.sqrt(np.pi * sig2 / 2.0) * np_erf(W / sig2))

crashes with the same error message. The list comprehension

@pm.deterministic
def f(sig2=sig2):
    return [1 / (pm.sqrt(np.pi * s / 2.0) * erf(W / s)) for s in sig2]

works as such, but leads to an error later in the code at this spot:

@pm.observed(plot=True)
def y(value=df['dist'], sig2=sig2, f=f):
    return (np.log(np.exp(-(value ** 2) / 2.0 / sig2) * f)).sum()

and the error is AttributeError: log.

I've got the calculation of the error function working using a numerical approximation, which should mean that the general setup is correct. It would be just nicer and clearer to use the erf function directly.

Upvotes: 2

Views: 1159

Answers (1)

elpres
elpres

Reputation: 416

I found the solution. I didn't realize that if you create a variable using the pymc.deterministic decorator, the parameters passed to the function are numpy.array, and not pymc.Distribution. And this allows to numpy.vectorize the function and apply it to the variable. So instead of

sig = pm.Exponential('sig', beta=0.1, size=10)
x = erf(sig ** 2)

you need to use

sig = pm.Exponential('sig', beta=0.1, size=10)
np_erf = np_vectorize(erf)

@pm.deterministic
def x(sig=sig):
    return np_erf(sig ** 2)

and it works.

Upvotes: 2

Related Questions