volodymyr

Reputation: 7544

PyMC3 SQLite backend: specify the list of variables to track

I am fitting a hierarchical model in which one variable is a vector with more than 10K elements, and the model needs over 500K samples to converge. I would like to use a persistent backend for the trace so that I can compare different models later. When I tried the SQLite backend, I got the following error:

/opt/conda/lib/python2.7/site-packages/pymc3/backends/sqlite.pyc in _create_table(self)
    123         statement = template.format(table=varname,
    124                                     value_cols=colnames)
--> 125         self.db.cursor.execute(statement)
    126 
    127     def _create_insert_queries(self, chain):

OperationalError: too many columns on individual_freq

I assume this happens because the backend tries to save a trace for all of my variables, including the vector variable with more than 10K elements. I don't need or want a trace for that vector; I am only interested in the top-level variables. With the in-memory backend, I can specify the list of variables explicitly like this:

trace = pm.sample(1000000, step, start=start, progressbar=False,
                  trace=[alpha, beta, uplift, mo_drop])

With SQLite, however, I can only pass the backend object itself:

backend = SQLite('beta_poisson_monthly_drop.sqlite')
trace = pm.sample(1000000, step, progressbar=False,
                  trace=backend)

What I want to do is something like this:

backend = SQLite('beta_poisson_monthly_drop.sqlite')
trace = pm.sample(1000000, step, progressbar=False,
                  trace=backend, vars=[alpha,beta,uplift,mo_drop])

Is this possible, or should it be a feature request? Thanks for any advice.

Upvotes: 3

Views: 444

Answers (1)

volodymyr

Reputation: 7544

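The SQLite backend's constructor accepts a vars argument, so the variables to track are specified when the backend is created rather than in the call to pm.sample: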

backend = SQLite('beta_poisson_monthly_drop.sqlite',
                 vars=[alpha,beta,uplift,mo_drop])
trace = pm.sample(1000000, step, progressbar=False, trace=backend)
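For background on the original error: the traceback suggests the backend creates one table per variable with one value column per element, and SQLite caps the number of columns in a table (2000 by default, and at most 32767 if the limit is raised at compile time). The sketch below reproduces the same OperationalError with only the standard sqlite3 module. It does not use pymc3; the helper try_create_table and the column names are made up for illustration, and 40000 columns is chosen only so that the limit is exceeded on any SQLite build.

```python
import sqlite3


def try_create_table(n_cols):
    """Try to create a table with n_cols value columns in an in-memory database.

    Hypothetical helper for illustration only; the table name mirrors the
    variable name from the error message in the question.
    """
    conn = sqlite3.connect(":memory:")
    # One REAL column per vector element, e.g. "v0 REAL, v1 REAL, ..."
    cols = ", ".join("v{} REAL".format(i) for i in range(n_cols))
    statement = "CREATE TABLE individual_freq (draw INTEGER, {})".format(cols)
    try:
        conn.execute(statement)
        return "ok"
    except sqlite3.OperationalError as exc:
        # SQLite rejects the statement once the column limit is exceeded.
        return str(exc)
    finally:
        conn.close()


small = try_create_table(4)      # a handful of scalar-like columns
large = try_create_table(40000)  # above SQLite's hard upper bound on columns

print(small)
print(large)
```

Restricting vars to the top-level variables keeps the large vector out of the database entirely, so no table with that many columns is ever created.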

Upvotes: 3
