Reputation: 153
A memory error occurs in Amazon SageMaker when preprocessing 2 GB of data stored in S3. Loading the data is not a problem. The data has 7 million rows and 64 columns. One-hot encoding is also not possible; attempting it results in a memory error. The notebook instance is ml.t2.medium. How can I solve this issue?
Upvotes: 15
Views: 21252
Reputation: 1109
I'm sure you have your answer by now, and it would be interesting to hear what it was. But when I came across a similar issue, these are the things we did, and they noticeably improved performance.
In retrospect, loading the whole dataset at once was not the best way of doing it; the major step up was switching to batch processing, as in the sketch below.
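Here is a minimal sketch of what I mean by batch processing, assuming the data is a CSV in S3 and you have pandas, s3fs, and scikit-learn installed. The bucket path and column names are placeholders, and the chunk size is just an example; adjust them to your data.

```python
import pandas as pd
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Placeholder S3 path and categorical columns; pandas can read s3:// URIs when s3fs is installed.
S3_URI = "s3://my-bucket/train.csv"
CATEGORICAL = ["col_a", "col_b"]

# First pass: read only the categorical columns to fit the encoder on all categories.
cats = pd.read_csv(S3_URI, usecols=CATEGORICAL)
encoder = OneHotEncoder(handle_unknown="ignore")  # outputs scipy sparse matrices by default
encoder.fit(cats)
del cats

# Second pass: stream the file in 100k-row chunks and encode each chunk into a sparse block,
# so the full 7M-row dense matrix never has to fit in RAM.
blocks = []
for chunk in pd.read_csv(S3_URI, chunksize=100_000):
    blocks.append(encoder.transform(chunk[CATEGORICAL]))

X = sparse.vstack(blocks)  # one sparse matrix, a fraction of the dense footprint
```

The key point is that the sparse representation keeps the one-hot-encoded output small, while the chunked reads keep the peak memory bounded by the chunk size rather than the dataset size.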
Upvotes: 0
Reputation: 2729
I assume you're processing the data on the notebook instance, right? A t2.medium has only 4 GB of RAM, so it's quite possible you're simply running out of memory.
Have you tried a larger instance? The specs are here: https://aws.amazon.com/sagemaker/pricing/instance-types/
Upvotes: 7
Reputation: 163
Can you create a post with your question on the AWS forum at https://forums.aws.amazon.com/forum.jspa?forumID=285? That way, the SageMaker team will be able to help you out.
Upvotes: 0