I'm trying to run a 2SLS to estimate price elasticity with linearmodels IV2SLS. This is what my data looks like:
| ln_q | ln_p | .... weather variables ... | ... instruments ... |... user id dummies ...|
All data is np.float32. My data array is approx. (200000, 20000), which is about 16 GB.
Using linearmodels IV2SLS I set up my model like:
dependent = ln_q
exog = weather variables + user id dummies
endog = ln_p
instruments = instruments
results = IV2SLS(dependent, exog, endog, instruments).fit()
When running with the full dataset I consistently get the error:
Unable to allocate 27.8GiB of memory to an array with shape (202507, 18450) and data type float64
I'm running 64-bit Python on a machine with 128 GB of RAM.
I've tried to circumvent this issue by passing my own weights:
results = IV2SLS(dependent, exog, endog, instruments, weights=np.ones(dependent.shape, dtype=np.float32)).fit()
but still get the same MemoryError.
Why is an array of ~16GB using >100GB of RAM in this process?
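For reference, the size in the error message matches a single float64 array of the reported shape, which is twice the size of the same array in float32. The short calculation below only reproduces that arithmetic; how many such arrays or temporaries the library holds at the same time is not something it shows.

```python
# Shape taken from the error message; 8 bytes per float64, 4 per float32.
rows, cols = 202507, 18450
GIB = 2 ** 30

float64_gib = rows * cols * 8 / GIB
float32_gib = rows * cols * 4 / GIB

print(f"float64: {float64_gib:.1f} GiB")  # float64: 27.8 GiB
print(f"float32: {float32_gib:.1f} GiB")  # float32: 13.9 GiB
```

A handful of float64 arrays of this size held at once would be enough to exceed 128 GB.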
What can I do to get this regression to run?
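One possible way around the memory limit, offered as an assumption rather than something from the setup above, is to drop the user id dummy columns and instead subtract each user's mean from every other variable (the within transformation). By the Frisch-Waugh-Lovell theorem this gives the same 2SLS slope coefficients as including the dummies, although standard errors then need a degrees-of-freedom correction for the absorbed user effects. The sketch below uses a hand-rolled `two_sls` helper and synthetic data (all names and numbers are hypothetical) to check the equivalence on a small problem; it does not call linearmodels.

```python
import numpy as np


def two_sls(y, X_exog, x_endog, Z):
    """Plain 2SLS via least squares; returns coefficients on [x_endog, X_exog]."""
    W = np.column_stack([Z, X_exog])  # first stage: instruments + exogenous regressors
    x_hat = W @ np.linalg.lstsq(W, x_endog, rcond=None)[0]  # fitted endogenous variable
    X2 = np.column_stack([x_hat, X_exog])  # second stage regressors
    return np.linalg.lstsq(X2, y, rcond=None)[0]


def demean_by_group(a, groups):
    """Subtract each group's mean from its rows (within transformation)."""
    a = np.asarray(a, dtype=np.float64)
    a2 = a.reshape(len(a), -1)
    n_groups = groups.max() + 1
    sums = np.zeros((n_groups, a2.shape[1]))
    np.add.at(sums, groups, a2)  # per-group column sums
    counts = np.bincount(groups, minlength=n_groups)[:, None]
    return (a2 - (sums / counts)[groups]).reshape(a.shape)


# Synthetic data: 30 hypothetical users with 20 observations each.
rng = np.random.default_rng(0)
n_users, per_user = 30, 20
user = np.repeat(np.arange(n_users), per_user)
n = user.size

user_effect = rng.normal(size=n_users)[user]
weather = rng.normal(size=(n, 2))
z = rng.normal(size=n)  # instrument
u = rng.normal(size=n)  # unobserved confounder that makes ln_p endogenous
ln_p = 0.8 * z + user_effect + 0.5 * u + rng.normal(size=n)
ln_q = -1.2 * ln_p + weather @ np.array([0.3, -0.2]) + user_effect + u + rng.normal(size=n)

# (a) Explicit dummies: one column per user, which is what does not scale.
D = np.eye(n_users)[user]
beta_dummies = two_sls(ln_q, np.column_stack([weather, D]), ln_p, z)[0]

# (b) Within transformation: no dummy columns at all.
beta_within = two_sls(
    demean_by_group(ln_q, user),
    demean_by_group(weather, user),
    demean_by_group(ln_p, user),
    demean_by_group(z, user),
)[0]

print(beta_dummies, beta_within)  # the two elasticity estimates should agree
```

With the dummies removed, the regressor matrix in the real problem would shrink from roughly 18,000 columns to the weather variables plus ln_p, and the demeaned arrays could then be passed to IV2SLS in place of the originals.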