frisbee
frisbee

Reputation: 73

Regression with “unidirectional” noise

I would like to estimate the parameters of a simple linear function and a gamma-distributed noise term from data. (Note: This is a follow-up question of https://stats.stackexchange.com/questions/88676/regression-with-unidirectional-noise, but simplified and more implementation-specific). Say I have my observed data generated as follows:

import numpy as np
np.random.seed(0)

size = 200
true_intercept = 1
true_slope = 2

# Generate observed data
x_ = np.linspace(0, 1, size)
true_regression_line = true_intercept + true_slope * x_  # y = a + b*x
noise_ = np.random.gamma(shape=1.0, scale=1.0, size=size)
y_ = true_regression_line + noise_

which looks as follows: enter image description here

I tried estimating these parameters using pymc as follows:

from pymc import Normal, Gamma, Uniform, Model, MAP
# Define priors
intercept = Normal('intercept', 0, tau=0.1)
slope = Normal('slope', 0, tau=0.1)
alpha = Uniform('alpha', 0, 2)
beta = Uniform('beta', 0, 2)
noise = Gamma('noise', alpha=alpha, beta=beta, size=size)

# Give likelihood > 0 to models where the regression line becomes larger than
# any of the datapoint
y = Normal('y', mu=intercept + slope * x_ + noise, tau=100,
           observed=True, value=y_)

# Perform MAP fit of model
model = Model([alpha, beta, intercept, slope, noise])
map_ = MAP(model)
map_.fit()

However, this gives me estimates that are far off the true values:

Am I doing something wrongly?

Upvotes: 0

Views: 197

Answers (1)

Chris Fonnesbeck
Chris Fonnesbeck

Reputation: 4203

You seem to be specifying a Normal likelihood as well as the Gamma noise, so you are adding additional Gaussian noise to the model, which does not seem to be warranted. Try expressing the likelihood as a Gamma, rather than a Normal, since this is the distribution of the residuals.

Upvotes: 0

Related Questions