lbernal

Reputation: 23

"Failed TCP accept: emfile" in Ejabberd

I'm running ejabberd 15.04 on Erlang/OTP 17 under 64-bit Ubuntu on each of my 4 Amazon EC2 instances. ejabberd was installed from source.

I've configured it with 65,535 file descriptors and the following configuration:

ERL_MAX_PORTS=360,000

ERL_PROCESSES=15,000,000

ERL_MAX_ETS_TABLES=100,000

The servers suddenly stopped working, and I found this line in the logs hundreds of times:

2016-05-09 13:22:45.901 [error] <0.397.0>@ejabberd_listener:accept:317 (#Port<0.4197>) Failed TCP accept: emfile

I have written my own modules and run ejabberd as a cluster of 4 Erlang nodes behind an Amazon Elastic Load Balancer. Each machine has 4 cores and 30 GB of RAM. I've migrated the roster module to ODBC (MariaDB, which is similar to MySQL), and 80k users are connected concurrently.

I think the file descriptor limit is high enough, and the Erlang process and port limits too. The error appeared suddenly; the servers had worked fine for 3 weeks. Could the cause have something to do with MySQL? If you know what the cause may be, I would be very grateful.

Thanks in advance.

Upvotes: 0

Views: 408

Answers (1)

reith

Reputation: 2088

The emfile error means the Erlang VM process has run out of file descriptors, so you need to increase the limit on open file descriptors. You can get the current maximum as seen by Erlang with:

proplists:get_value(max_fds, erlang:system_info(check_io)).
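The same limit can also be cross-checked from outside the VM. The following is only a sketch, assuming a Linux host where /proc is mounted; the function name soft_nofile_limit is made up for illustration, and the example call inspects the current shell rather than ejabberd. For a real check you would pass the PID of the Erlang VM process (commonly named beam.smp, though that may differ on your install).

```shell
# Print the soft "Max open files" limit of a running process (Linux only).
# Usage: soft_nofile_limit PID
# soft_nofile_limit is a hypothetical helper name, not part of ejabberd or Erlang.
soft_nofile_limit() {
    # /proc/PID/limits has one row per limit; on the "Max open files" row,
    # field 4 is the soft limit and field 5 is the hard limit.
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example: inspect the current shell. For ejabberd, pass the VM's PID instead.
soft_nofile_limit "$$"
```

If this prints something much lower than 65535 for the ejabberd process, the limit configured in the OS is probably not reaching the process that was actually started.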

If it is not the same as the value set in the OS, and ejabberd is started by an Upstart script, make sure you set the file descriptor limit in that script using the limit stanza:

limit nofile 65535 65535
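After restarting with the new limit, it may help to watch how many descriptors the node actually holds open. This is a rough sketch, again assuming Linux with /proc; open_fd_count is a hypothetical helper name, and reading another process's fd directory usually requires running as the same user or as root.

```shell
# Count the file descriptors currently open in a process (Linux only).
# Usage: open_fd_count PID
# open_fd_count is a hypothetical helper name used only for this example.
open_fd_count() {
    # Each entry in /proc/PID/fd is one open descriptor (sockets, files, pipes).
    ls "/proc/$1/fd" | wc -l
}

# Example: count descriptors of the current shell. For ejabberd, pass the VM's PID instead.
open_fd_count "$$"
```

Each connected client uses at least one socket, so with 80k users spread over 4 nodes the count on each node should be expected to sit well above the common default limit of 1024.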

Upvotes: 0
