Reputation: 31497
Every day I get a new Erlang crash report on my server. Since ejabberd is the only Erlang application I run, it must be the cause of the crash.
The crash dump (erl_crash.dump) has almost 9,000 lines, so I have no idea how to debug it. When I searched that file for "ejabberd", there were 5 occurrences, and every single one was related to "ejabberdctl".
I'm calling ejabberdctl from a PHP script (via exec()) to programmatically create users. Could that somehow be the cause of the crash?
In the /var/log/ejabberd directory, I've found some errors in erlang.log and ejabberd.log, but I don't really know how to resolve them:
=ERROR REPORT====
Mnesia('ejabberd@MYHOST'): ** ERROR ** (core dumped to file: "/var/lib/ejabberd/MnesiaCore.ejabberd@MYHOST_...")
** FATAL ** mnesia_monitor crashed: {badarg,
[{ets,lookup,
[mnesia_decision,
'ejabberdctl@MYHOST']},
{mnesia_recover,has_mnesia_down,1},
{mnesia_monitor,handle_info,2},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]} state: {state,
<0.65.0>,
[],
[],
true,
[],
undefined,
[]}
=ERROR REPORT====
Mnesia('ejabberd@MYHOST'): ** WARNING ** Mnesia is overloaded: {dump_log,
time_threshold}
=CRASH REPORT====
crasher:
initial call: ejabberd_listener:init/3
pid: <0.366.0>
registered_name: []
exception exit: {timeout,
{gen_server,call,
[<0.682.0>,{become_controller,<0.685.0>}]}}
in function gen_server:call/2
in call from ejabberd_listener:accept/3
ancestors: [ejabberd_listeners,ejabberd_sup,<0.39.0>]
messages: [{#Ref<0.0.0.11304>,ok}]
links: [#Port<0.2761>,<0.274.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 2584
stack_size: 24
reductions: 20938
neighbours:
Upvotes: 4
Views: 3128
Reputation: 9055
Only one ejabberdctl can run at a time. If your PHP launches a second one while the first is still running, the node names conflict, and that produces the crash you see.
Do not call ejabberdctl from code; rely on the API instead.
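As a rough illustration of "rely on the API", here is a minimal sketch that creates a user over HTTP instead of spawning ejabberdctl. It assumes a version of ejabberd that ships mod_http_api, that the module is enabled on a listener at 127.0.0.1:5280, and that the caller is allowed to run the register command. None of that is confirmed by the question, and the register_user function and the EJABBERD_API variable are names made up for this example.

```shell
# Sketch only. Assumptions: mod_http_api is enabled and reachable at
# http://127.0.0.1:5280/api, and this client may run the "register" command.
# EJABBERD_API is a hypothetical variable for overriding the base URL.
register_user() {
  # $1: username, $2: XMPP host, $3: password
  # Note: values are pasted into the JSON as-is, so they must not contain
  # double quotes or backslashes.
  curl --silent --show-error --fail \
    --header 'Content-Type: application/json' \
    --data "{\"user\":\"$1\",\"host\":\"$2\",\"password\":\"$3\"}" \
    "${EJABBERD_API:-http://127.0.0.1:5280/api}/register"
}
```

You would call it as register_user alice MYHOST some-password. Because each request is handled inside the already running ejabberd node, no extra Erlang node is started per call.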
Upvotes: 2
Reputation: 1
You could use ssh port forwarding to export webtool to your local machine, where you can point a browser at it. Exposing it to the whole internet would probably be a bad security error.
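A minimal sketch of that port forwarding, assuming webtool is listening on the server's loopback interface on port 8888 (as in the WebTool output quoted in this thread). The function name open_webtool_tunnel and the login user@example.com are placeholders, not anything from your setup.

```shell
# Forward a local port to the same port on the server's loopback interface.
open_webtool_tunnel() {
  # $1: ssh login (user@host), $2: port, defaults to 8888
  # -N: do not run a remote command, only forward the port
  # -L: local_port:destination_host:destination_port (as seen from the server)
  ssh -N -L "${2:-8888}:127.0.0.1:${2:-8888}" "$1"
}
# Usage (placeholder login): open_webtool_tunnel user@example.com
```

While the tunnel is open, http://localhost:8888/ in a browser on your local machine reaches webtool on the server, and nothing is exposed on the server's public interface.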
Upvotes: 0
Reputation: 551
Do you have an erlang.log log file? If so, you should find good info in there about a crash.
Upvotes: 0
Reputation: 606
The erl_crash.dump file contains the state of almost everything at the moment the Erlang VM crashed. There's a tool for analyzing it:
Start an Erlang shell and start the webtool:
somebody@somehost> erl
Erlang R15B02 (erts-5.9.2) [source] [smp:2:2] [async-threads:0] [kernel-poll:false]
Eshell V5.9.2 (abort with ^G)
1> webtool:start().
WebTool is available at http://localhost:8888/
Or http://127.0.0.1:8888/
{ok,<0.35.0>}
2>
Navigate to the address given above with your browser, and click WebTool -> Start Tools -> CrashDumpViewer -> Start, then CrashDumpViewer -> Load Crashdump.
Look for the Slogan in General Information. It is a summary of the reason for the crash.
Look for processes with a state other than Waiting. Those processes were doing something when the Erlang VM crashed, so they are probably the source.
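If you just want a quick first look without the browser, the same two checks can be roughed out on the command line. This is a sketch that assumes the dump is plain text with a "Slogan:" line near the top and one "=proc:<pid>" section per process containing a "State:" line; the function name summarize_dump is made up for the example.

```shell
summarize_dump() {
  # $1: path to the crash dump (defaults to ./erl_crash.dump)
  dump="${1:-erl_crash.dump}"
  # Print the first "Slogan:" line, the summarized crash reason.
  grep -m 1 '^Slogan:' "$dump"
  # Remember the pid of the current "=proc:" section (cleared on any other
  # section header) and print processes whose state is not Waiting.
  awk '
    /^=/ { pid = ($0 ~ /^=proc:/) ? substr($0, 7) : "" }
    /^State:/ && pid != "" && $2 != "Waiting" {
      sub(/^State:[ \t]*/, "")
      print pid, $0
    }
  ' "$dump"
}
```

Run it as summarize_dump /path/to/erl_crash.dump. It is no replacement for CrashDumpViewer, but it narrows a 9,000-line file down to the slogan and a handful of pids to look up.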
Upvotes: 4