Reputation: 48483
I have a Ruby on Rails app on an Ubuntu server in the AWS infrastructure.
The app has been running there for 4 years and everything was fine until last week, when I started receiving these (and similar) error messages:
Errno::ENOSPC: No space left on device @ io_write - /home/deployer/apps/myapp-production/shared/log/unicorn.stderr.log
I had to log in to the server and empty the log file with > unicorn.stderr.log. Now this error occurs every day (or every other day).
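In case it matters, the exact commands I use to empty it look roughly like this (truncating the file in place rather than deleting it, since Unicorn keeps it open and the space of a deleted-but-open file is not freed until the process restarts):
cd /home/deployer/apps/myapp-production/shared/log
> unicorn.stderr.log               # truncate the open log file to zero bytes
truncate -s 0 unicorn.stderr.log   # equivalent alternative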
This is what the log files in my Rails app look like:
drwxrwxr-x 2 deployer deployer 4096 Sep 28 06:28 .
drwxrwxr-x 11 deployer deployer 4096 Jun 11 2016 ..
-rw-rw-r-- 1 deployer deployer 0 Sep 9 2017 newrelic_agent.log
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:22 newrelic_agent.log.1
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.2.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.3.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.4.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.5.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.6.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 19 16:23 newrelic_agent.log.7.gz
-rw-rw-r-- 1 deployer deployer 0 Feb 20 2018 procat
-rw-rw-r-- 1 deployer deployer 12480512 Sep 28 21:12 production.log
-rw-rw-r-- 1 deployer deployer 71216391 Sep 28 06:28 production.log.1
-rw-rw-r-- 1 deployer deployer 20 Sep 27 12:22 production.log.2.gz
-rw-rw-r-- 1 deployer deployer 20 Sep 26 15:27 production.log.3.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:28 production.log.4.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:28 production.log.5.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:28 production.log.6.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:28 production.log.7.gz
-rw-rw-r-- 1 deployer deployer 1391716 Sep 28 21:11 skylight.log
-rw-rw-r-- 1 deployer deployer 734536 Sep 28 06:28 skylight.log.1
-rw-rw-r-- 1 deployer deployer 20 Sep 27 12:23 skylight.log.2.gz
-rw-rw-r-- 1 deployer deployer 20 Sep 26 15:28 skylight.log.3.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:29 skylight.log.4.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:29 skylight.log.5.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:29 skylight.log.6.gz
-rw-rw-r-- 1 deployer deployer 0 Sep 26 15:29 skylight.log.7.gz
-rw-rw-r-- 1 deployer deployer 0 May 20 2018 staging.log
-rw-rw-r-- 1 deployer deployer 0 Oct 4 2016 unicorn.log
-rw-rw-r-- 1 deployer deployer 1 Oct 4 2016 unicorn.log.1
-rw-rw-r-- 1 deployer deployer 20480 Sep 28 21:13 unicorn.stderr.log
This is what df -H says before emptying the log files:
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 13k 2.0G 1% /dev
tmpfs 395M 373k 395M 1% /run
/dev/xvda1 17G 16G 0 100% /
none 4.1k 0 4.1k 0% /sys/fs/cgroup
none 5.3M 0 5.3M 0% /run/lock
none 2.0G 0 2.0G 0% /run/shm
none 105M 0 105M 0% /run/user
And this is after emptying them:
Filesystem Size Used Avail Use% Mounted on
udev 1.9G 12K 1.9G 1% /dev
tmpfs 377M 364K 377M 1% /run
/dev/xvda1 16G 15G 437M 98% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
none 5.0M 0 5.0M 0% /run/lock
none 1.9G 0 1.9G 0% /run/shm
none 100M 0 100M 0% /run/user
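(For completeness: ENOSPC can also mean the filesystem ran out of inodes rather than bytes, which df -i would show, though here the byte usage alone already explains the error.)
df -i /    # inode usage on the root filesystem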
The app's traffic is pretty consistent; it's not as if it increased 10x recently.
I also tried to find the biggest files on the server with find / -size +100M (a fuller version of the command follows the output below), and this is the output:
/var/log/nginx/myapp_production.access.log.1
/var/log/nginx/myapp_production.access.log
/var/log/btmp.1
find: `/var/spool/rsyslog': Permission denied
find: `/var/spool/cron/atjobs': Permission denied
find: `/var/spool/cron/crontabs': Permission denied
find: `/var/spool/cron/atspool': Permission denied
find: `/var/cache/ldconfig': Permission denied
find: `/var/lib/polkit-1': Permission denied
find: `/var/lib/monit/events': Permission denied
find: `/var/lib/nginx/scgi': Permission denied
find: `/var/lib/nginx/body': Permission denied
find: `/var/lib/nginx/uwsgi': Permission denied
find: `/var/lib/nginx/fastcgi': Permission denied
find: `/var/lib/nginx/proxy': Permission denied
find: `/var/lib/sudo': Permission denied
find: `/etc/ssl/private': Permission denied
find: `/etc/chatscripts': Permission denied
find: `/etc/polkit-1/localauthority': Permission denied
find: `/etc/ppp/peers': Permission denied
find: `/etc/sudoers.d': Permission denied
find: `/root': Permission denied
find: `/run/user/1003': Permission denied
...
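For reference, a fuller version of the same search - run with sudo so the permission errors go away, restricted to regular files on the root filesystem, and printing human-readable sizes - would look roughly like this:
sudo find / -xdev -type f -size +100M -exec ls -lh {} \;
# -xdev   stay on the / filesystem only
# -type f match regular files, not directories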
I looked at these two files - /var/log/nginx/myapp_production.access.log.1 and /var/log/nginx/myapp_production.access.log - and their sizes are 219 MB and 248 MB.
The /var/log/btmp.1 file is 330 MB. Can I delete this one?
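(For reference, /var/log/btmp stores failed login records and is readable with lastb; a sketch of inspecting the rotated copy before touching it, run as root:)
sudo lastb -f /var/log/btmp.1 | head -20   # most recent entries in the rotated failed-login log
sudo rm /var/log/btmp.1                    # the rotated copy only holds old history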
If I display the size of the /var/log/nginx directory with du -hs ., it comes to 846 MB. Can I empty the log files in this directory without affecting the functionality of the application?
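(If it helps, the commands I would use to empty them look roughly like this - truncating the live file in place and forcing a rotation, assuming the usual /etc/logrotate.d/nginx file shipped by the Ubuntu nginx package is present:)
sudo truncate -s 0 /var/log/nginx/myapp_production.access.log   # empty the file nginx has open
sudo rm /var/log/nginx/myapp_production.access.log.1            # rotated copy, no longer written to
sudo logrotate -f /etc/logrotate.d/nginx                        # force a rotation cycle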
Also, any idea why I have suddenly started running out of free disk space on the server? How should I debug this situation?
Thank you in advance.
Upvotes: 0
Views: 256
Reputation: 1404
I personally like to use the ncdu tool to find the files on the system that are taking up the most space.
Once you have it installed, you just type ncdu <directory> at the command line and get a nice interface that lists the contents of that directory sorted largest to smallest. You can navigate down into each subdirectory and find the files taking up the most space.
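For example, on a Debian/Ubuntu box (the -x flag keeps ncdu on a single filesystem, so the totals line up with what df reports for /):
sudo apt-get install ncdu   # package name on Debian/Ubuntu
sudo ncdu -x /              # scan the root filesystem; arrow keys navigate, d deletes a file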
If your server has been up for four years, it's inevitable that the 17G disk would eventually fill up with server logs. You probably have a bunch of nginx logs and application server logs that you could delete.
As Jon suggested, depending on your needs, you could set up rotating logs to prevent the server from filling up. You could also double the size of your storage space if you want to store all that history.
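A minimal sketch of such a logrotate rule, assuming the shared/log path from your question (the file name /etc/logrotate.d/myapp is just a placeholder and the retention numbers are up to you):
# /etc/logrotate.d/myapp
/home/deployer/apps/myapp-production/shared/log/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
copytruncate makes logrotate empty the live files in place, so Unicorn does not need to be signalled to reopen its logs.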
Upvotes: 1