deemel

Reputation: 1036

Hierarchical varying effects model with MVN prior

What I'm trying to do

I've already worked with multivariate priors in PyMC (I'm using 4.0.1), but I can't get them to work in a hierarchical model. In my example I'm modeling a regression problem with two covariates x1, x2 and an outcome y. There are two categorical features in the data that define the hierarchy levels: dim1 indicates the higher-level category and dim2 the lower-level one.

The model consists of an intercept plus one slope each for x1 and x2. A multivariate normal hyperprior B_1 defines varying intercepts and slopes by dim1, in the order [intercept, slope_x1, slope_x2]. These then inform the prior B_2 on the lower level, where the terms also vary by dim2.
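Concretely, given the coords below (two dim1 levels, six dim2 levels), B_1 should hold one coefficient vector per dim1 level, i.e. shape (2, 3), and B_2 one per (dim2, dim1) pair, i.e. shape (6, 2, 3).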

Problem

When trying to sample from the model, I get the following error: "ValueError: Invalid dimension for value: 3".
I interpret this as the '3' in the dims for B_1 and B_2 being wrong, but when evaluating the model variables manually, everything seems fine:

- B_2.eval().shape correctly shows that the RV has shape (6, 2, 3).
- The linear model in mu can also be evaluated without issue.
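For reference, these checks look like this (run inside the model context defined below):

B_2.eval().shape   # (6, 2, 3) -- the prior has the intended batched shape
mu.eval().shape    # evaluates without error, one value per observation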

The error traceback points to this check as the source:

/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/pymc/distributions/multivariate.py in quaddist_parse(value, mu, cov, mat_type)
    133     """Compute (x - mu).T @ Sigma^-1 @ (x - mu) and the logdet of Sigma."""
    134     if value.ndim > 2 or value.ndim == 0:
--> 135         raise ValueError("Invalid dimension for value: %s" % value.ndim)
    136     if value.ndim == 1:
    137         onedim = True

So the value of an MvNormal can't have more than two dimensions. This seems odd, because batches of multivariate distributions with more than two dimensions are pretty common. That leads me to believe that the way I'm specifying the MVN priors is just wrong, but after much trying I'm pretty stuck. What's the correct way to specify B_1 and B_2 here?
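If it helps, the restriction seems reproducible in isolation; a minimal sketch (my own repro, assuming the check in the traceback applies to the value of any MvNormal):

import numpy as np
import pymc as pm

ok = pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3), shape=(6, 3))
bad = pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3), shape=(6, 2, 3))

pm.logp(ok, np.zeros((6, 3))).eval()   # fine: the value is 2-D (a batch of vectors)
pm.logp(bad, np.zeros((6, 2, 3)))      # ValueError: Invalid dimension for value: 3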

Code for model

import numpy as np
import pymc as pm
with pm.Model(coords={'dim1': np.arange(2), 'dim2': np.arange(6)}) as mdl:
    # X and y are created by the data-generation code below
    d1 = pm.MutableData('d1', X['d1'].values, dims='obs_id')
    d2 = pm.MutableData('d2', X['d2'].values, dims='obs_id')
    # upper level
    sd_1 = pm.HalfNormal.dist(1, shape=3)
    chol_1, _, _ = pm.LKJCholeskyCov('chol_1', n=3, eta=1,
                                     sd_dist=sd_1, compute_corr=True)
    B_1 = pm.MvNormal('B_1', mu=[5, 0, 0], chol=chol_1,
                      dims=('dim1', '3'))
    # lower level
    sd_2 = pm.HalfNormal.dist(1, shape=3)
    chol_2, _, _ = pm.LKJCholeskyCov('chol_2', n=3, eta=1,
                                     sd_dist=sd_2, compute_corr=True)
    B_2 = pm.MvNormal('B_2', mu=B_1, chol=chol_2,
                      dims=('dim2', 'dim1', '3'))
    # regular robust regression, mu defined as a linear model of the covariates
    sigma = pm.HalfNormal('sigma', 2)
    mu = pm.Deterministic(
        'mu',
        B_2[d2, d1, 0] +                   # intercept
        B_2[d2, d1, 1] * X['x1'].values +  # slope x1
        B_2[d2, d1, 2] * X['x2'].values,   # slope x2
        dims='obs_id'
    )
    outcome = pm.StudentT('outcome', nu=3, mu=mu, sigma=sigma, observed=y, dims='obs_id')
    trace = pm.sample(draws=2000, tune=1000, target_accept=0.95, random_seed=0)

Code for data generation

import numpy as np
import pandas as pd
from scipy import stats
import pymc as pm

rng = np.random.default_rng(0)
N = 1000
# generate covariates and categories
X = pd.DataFrame(
    {
        'x1': stats.halfnorm(loc=0, scale=3).rvs(N, random_state=rng),
        'x2': stats.norm(loc=0, scale=2).rvs(N, random_state=rng),
        'd1': rng.choice([0, 1], size=N, p=[0.6, 0.4]),
        'd2': rng.choice(np.arange(6), size=N, p=[0.1, 0.2, 0.1, 0.3, 0.1, 0.2]),
    }
)
# means of the parameter distributions
intercept = np.array([
    [5,5,6,7,6,5],
    [4,4,5,6,6,4]
])
slope1 = np.array([
    [0,0.7,0.3,-0.2,-1,0],
    [0.5,1,-1,0.6,-0.2,0.3]
])
slope2 = np.array([
    [0,0.7,0.3,-0.2,-1,0],
    [0.5,1,-1,0.6,-0.2,0.3]
])*1.5
# generate some random correlation matrices (used below as covariances, i.e. unit variances)
corrs = []
for _ in np.arange(6):
    _, corr, _ = pm.LKJCholeskyCov.dist(eta=1, n=3, sd_dist=pm.HalfNormal.dist(1, shape=3), compute_corr=True)
    corrs.append(corr.eval())

# generate the outcome group by group: draw one coefficient vector per (d1, d2) pair
y = np.zeros(N)
for d1 in [0, 1]:
    for d2 in np.arange(6):
        ind = (X['d1'] == d1) & (X['d2'] == d2)
        mv = stats.multivariate_normal(
            mean=[intercept[d1, d2], slope1[d1, d2], slope2[d1, d2]],
            cov=corrs[d2]).rvs(1, random_state=rng)
        y[ind] = mv[0] + X.loc[ind, 'x1']*mv[1] + X.loc[ind, 'x2']*mv[2] + rng.normal(loc=0, scale=1, size=ind.sum())

Upvotes: 1

Views: 68

Answers (1)

deemel

Reputation: 1036

Okay, after exhausting all apparent options, I found that this is currently not supported by PyMC: the logp of MvNormal rejects values with more than two dimensions, so a batched prior like B_2 above can't be expressed directly, even though the approach itself would be fine if it were supported. Hope I can save someone time this way.
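One way to express the same structure within that limit is a non-centered parameterization: draw iid standard normals for the lower level and correlate them by hand with the Cholesky factor, so no MvNormal ever sees a value with more than two dimensions. A minimal sketch (my own reformulation, untested against the data above):

import numpy as np
import pymc as pm

with pm.Model(coords={'dim1': np.arange(2), 'dim2': np.arange(6)}) as mdl:
    # upper level as in the question; the value of B_1 is 2-D, so MvNormal is fine
    sd_1 = pm.HalfNormal.dist(1, shape=3)
    chol_1, _, _ = pm.LKJCholeskyCov('chol_1', n=3, eta=1,
                                     sd_dist=sd_1, compute_corr=True)
    B_1 = pm.MvNormal('B_1', mu=[5, 0, 0], chol=chol_1, shape=(2, 3))

    # lower level, non-centered: z @ chol_2.T has covariance chol_2 @ chol_2.T,
    # so B_2 ends up distributed like MvNormal(mu=B_1, chol=chol_2)
    sd_2 = pm.HalfNormal.dist(1, shape=3)
    chol_2, _, _ = pm.LKJCholeskyCov('chol_2', n=3, eta=1,
                                     sd_dist=sd_2, compute_corr=True)
    z = pm.Normal('z', 0, 1, shape=(6, 2, 3))
    B_2 = pm.Deterministic('B_2', B_1[None, :, :] + pm.math.dot(z, chol_2.T))

The rest of the model (mu, sigma, outcome) can then stay exactly as in the question.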

Upvotes: 1
