Cade Karrenberg


Fixed effects 2SLS with Python linearmodels: memory error

I'm trying to run a 2SLS regression to estimate price elasticity with linearmodels' IV2SLS. This is what my data looks like:
| ln_q | ln_p | ... weather variables ... | ... instruments ... | ... user id dummies ... |

All data is np.float32. My data array is approximately (200000, 20000), which is about 16 GB.

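I set up the model with linearmodels' IV2SLS like this:

from linearmodels.iv import IV2SLS

# Note: IV2SLS takes its arguments positionally as (dependent, exog, endog, instruments),
# so the controls below land in the exog slot and ln_p in the endog slot.
dependent = ln_q
endog = weather variables + user id dummies
exog = ln_p
instruments = instruments
results = IV2SLS(dependent, endog, exog, instruments).fit()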

When running with the full dataset I consistently get the error:
Unable to allocate 27.8GiB of memory to an array with shape (202507, 18450) and data type float64
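
As a sanity check on those numbers, the size in the error message is what a single float64 array of that shape would need, which suggests the float32 data is being copied to float64 somewhere along the way. A minimal sketch of the arithmetic (the helper name `array_gib` is made up for illustration):

```python
def array_gib(rows, cols, bytes_per_item):
    """Size of a dense rows x cols array in GiB (2**30 bytes)."""
    return rows * cols * bytes_per_item / 2**30

# Shape taken from the error message.
as_float64 = array_gib(202507, 18450, 8)  # 8 bytes per float64
as_float32 = array_gib(202507, 18450, 4)  # 4 bytes per float32

print(f"float64: {as_float64:.1f} GiB, float32: {as_float32:.1f} GiB")
```

Each float64 copy of the design matrix is therefore twice the size of the float32 original, so a handful of intermediate copies is enough to exhaust 128 GB.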
I'm running 64-bit Python on a machine with 128 GB of RAM. I've tried to circumvent this issue by passing my own weights:
results = IV2SLS(dependent, endog, exog, instruments, weights=np.ones(dependent.shape, dtype=np.float32)).fit()
but I still get the same MemoryError.
Why is an array of ~16 GB using >100 GB of RAM in this process? What can I do to get this regression to run?
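
One direction that might be worth trying is to drop the user id dummies altogether and apply the within transformation instead: by the Frisch-Waugh-Lovell theorem, demeaning ln_q, ln_p, the weather variables, and the instruments within each user yields the same slope estimates as including one dummy per user. The following is only a minimal NumPy sketch under that assumption; `demean_by_group`, `user_ids`, and the toy data are hypothetical names, not part of the setup above or of linearmodels.

```python
import numpy as np

def demean_by_group(x, groups):
    """Subtract each group's column means from x (within transformation)."""
    x = np.asarray(x, dtype=np.float64)
    if x.ndim == 1:
        x = x[:, None]
    # Map arbitrary group labels to integer codes 0..G-1.
    _, codes = np.unique(np.asarray(groups).ravel(), return_inverse=True)
    counts = np.bincount(codes)
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        # Per-group sum of column j, then broadcast the group mean back to rows.
        sums = np.bincount(codes, weights=x[:, j])
        out[:, j] = x[:, j] - (sums / counts)[codes]
    return out

# Toy data standing in for the real columns (assumption: one user id per row).
rng = np.random.default_rng(0)
user_ids = rng.integers(0, 50, size=1000)
weather = rng.normal(size=(1000, 3))
weather_within = demean_by_group(weather, user_ids)
```

The demeaned arrays have only as many columns as the non-dummy variables, so they could then be passed to IV2SLS in place of the originals. Two caveats with this approach: a constant should not be added after demeaning, and the reported standard errors would not account for the degrees of freedom used up by the absorbed user effects unless they are adjusted separately.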
