Reputation: 735
As the title suggests, I'm running Jupyter in a Docker container, and I'm getting the OSError from Python deep inside the scikit-learn/numpy code, at the following line:
pickler.file_handle.write(chunk.tostring('C'))
I've done some troubleshooting. For most people who hit this, the hard drive or RAM has actually run out of space, which isn't the case for me, AFAIK.
This is what my df looks like:
Filesystem 1K-blocks Used Available Use% Mounted on
udev 16419976 0 16419976 0% /dev
tmpfs 3288208 26320 3261888 1% /run
/dev/sdb7 125996884 72177548 47395992 61% /
tmpfs 16441036 238972 16202064 2% /dev/shm
tmpfs 5120 4 5116 1% /run/lock
tmpfs 16441036 0 16441036 0% /sys/fs/cgroup
/dev/sdb2 98304 32651 65653 34% /boot/efi
tmpfs 3288208 68 3288140 1% /run/user/1000
//192.168.1.173/ppo-server3 16864389368 5382399064 11481990304 32% /mnt/ppo-server3
This is what my free looks like:
total used free shared buff/cache available
Mem: 32882072 7808928 14265280 219224 10807864 24357276
Swap: 976892 684392 292500
Am I looking at the right df and free outputs? Both of them are being run from a bash instance inside the container.
Upvotes: 38
Views: 39680
Reputation: 9908
Docker leaves dangling images around that can take up your space. To clean up after docker, run the following:
docker system prune -af
We can pass 'until' to the --filter option to remove objects that were created before a given timestamp or duration, as shown below (here, objects older than 2 minutes):
docker system prune -a --filter "until=2m"
or in older versions of docker:
docker rm $(docker ps -q -f 'status=exited')
docker rmi $(docker images -q -f "dangling=true")
This will remove exited containers and dangling images, which hopefully frees up device space.
Meta: I'm putting this answer here because it's the top Stack Overflow result for that failure, and this is a possible fix for it.
Upvotes: 71
Reputation: 41
I had the same problem when running parallel processes in Docker. The trouble is that, by default, some processes use /dev/shm to store cache data, and inside a Docker container that location is limited to about 64MB by default. You can change the path where your parallel jobs store their cache with these two lines of Python. If you're using Pandarallel, this solution will help you too.
import os
# joblib will use this folder for its temporary memmapped files instead of /dev/shm
os.environ['JOBLIB_TEMP_FOLDER'] = '/tmp'
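As a side note, the variable has to be set before the first Parallel call so the workers pick it up. Here is a minimal runnable sketch of that (my own illustration, not the original poster's code; the squaring workload is a made-up placeholder):
import os
os.environ['JOBLIB_TEMP_FOLDER'] = '/tmp'  # set before the first Parallel call

from joblib import Parallel, delayed

# stand-in workload: square a few numbers across two worker processes
results = Parallel(n_jobs=2)(delayed(pow)(i, 2) for i in range(8))
print(results)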
For users of pandarallel, add these lines too:
from pandarallel import pandarallel
pandarallel.initialize(use_memory_fs=False)
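To show it end to end, here is a minimal self-contained sketch (again my own illustration; the DataFrame and the double function are hypothetical placeholders):
import pandas as pd
from pandarallel import pandarallel

# use_memory_fs=False makes pandarallel exchange data through regular
# on-disk pickling instead of the size-limited /dev/shm memory filesystem
pandarallel.initialize(use_memory_fs=False)

def double(v):
    return v * 2

df = pd.DataFrame({'x': range(1000)})
print(df['x'].parallel_apply(double).head())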
Upvotes: 2
Reputation: 967
If it helps anyone, I received the same error, and the problem was that one of my app's log files (laravel.log) was almost 11GB in size. Deleting that file resolved my problem.
Upvotes: 0
Reputation: 3322
As mentioned in the comment by @PeerEZ, this happens when sklearn attempts to parallelize jobs.
sklearn communicates between processes by writing to /dev/shm, which is limited to 64MB in Docker containers by default.
You can try running with n_jobs=1 as suggested by @PeerEZ (if you can't restart the container), or, if parallelization is required, try running the container with the --shm-size option to set a bigger size for /dev/shm, e.g.:
docker run --shm-size=512m <image-name>
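For completeness, a minimal sketch of the n_jobs=1 fallback mentioned above (the estimator and the synthetic dataset here are hypothetical, just for illustration):
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# n_jobs=1 keeps everything in a single process, so joblib never
# memmaps arrays into the size-limited /dev/shm
clf = RandomForestClassifier(n_estimators=100, n_jobs=1, random_state=0)
clf.fit(X, y)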
Upvotes: 27