Can't start process in erlang node

Question

I have two Erlang nodes: node01 is 'vm01@192.168.146.128' and node02 is 'vm02@192.168.146.128'. From node02 I want to start a process on node01 using spawn(Node, Mod, Fun, Args), but I always get a useless pid.

Node connection is ok:

(vm02@192.168.146.128)14> net_adm:ping('vm01@192.168.146.128').
pong

The module is in the code path on both node01 and node02:

(vm01@192.168.146.128)7> m(remote_process).
Module: remote_process
MD5: 99784aa56b4feb2f5feed49314940e50
Compiled: No compile time info available
Object file: /src/remote_process.beam
Compiler options:  []
Exports: 
         init/1
         module_info/0
         module_info/1
         start/0
ok

(vm02@192.168.146.128)20> m(remote_process).
Module: remote_process
MD5: 99784aa56b4feb2f5feed49314940e50
Compiled: No compile time info available
Object file: /src/remote_process.beam
Compiler options:  []
Exports: 
         init/1
         module_info/0
         module_info/1
         start/0
ok

However, the spawn is not successful:

(vm02@192.168.146.128)21> spawn('vm01@192.168.146.128', remote_process, start, []). 
I'm on node 'vm01@192.168.146.128'
<9981.89.0>
My pid is <9981.90.0>

(vm01@192.168.146.128)8> whereis(remote_process).
undefined

The process runs fine on the local node:

(vm02@192.168.146.128)18> remote_process:start().
I'm on node 'vm02@192.168.146.128'
My pid is <0.108.0>
{ok,<0.108.0>}

(vm02@192.168.146.128)24> whereis(remote_process).
<0.115.0>

But it fails on the remote node. Can anyone give me an idea why?

Here is the source code remote_process.erl:

-module(remote_process).
-behaviour(supervisor).
-export([start/0, init/1]).

start() ->
    {ok, Pid} = supervisor:start_link({global, ?MODULE}, ?MODULE, []),
    {ok, Pid}.

init([]) ->
    io:format("I'm on node ~p~n", [node()]),
    io:format("My pid is ~p~n", [self()]),
    {ok, {{one_for_one, 1, 5}, []}}.

Pascal · Accepted Answer

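You are registering your process globally ({global, ?MODULE}), which is what you need for your purpose. Note that whereis/1 only looks names up in the node-local registry, so it is the wrong lookup for a globally registered name. The function to retrieve it is global:whereis_name(remote_process).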
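
To make the difference between the two registries concrete, here is a minimal sketch. The module name registry_demo and the registered name registry_demo_proc are made up for illustration; they are not part of your code. It registers a throwaway process with global only, then looks the name up both ways:

```erlang
-module(registry_demo).
-export([run/0]).

%% Registers a throwaway process in the global registry only, then
%% looks the name up in the local and in the global registry.
run() ->
    Name = registry_demo_proc,            % hypothetical name, illustration only
    Pid = spawn(fun() -> receive stop -> ok end end),
    yes = global:register_name(Name, Pid),
    Local = whereis(Name),                % node-local registry lookup
    Global = global:whereis_name(Name),   % global registry lookup
    _ = global:unregister_name(Name),     % clean up before stopping the process
    Pid ! stop,
    {Local, Global =:= Pid}.
```

Calling registry_demo:run() should return {undefined, true}: the local lookup does not see the name, the global one does.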

Edit: It works if all of the following hold:

- the two nodes are connected (check with nodes())
- the process is registered with the global module
- the process is still alive

If any of these conditions is not satisfied, you will get undefined.
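
The three conditions above can be checked in one go with a small helper. This is only a sketch: the module global_check and the function diagnose/1 are names I made up, and the shape of the returned tuple is arbitrary.

```erlang
-module(global_check).
-export([diagnose/1]).

%% Looks Name up in the global registry and reports what was found,
%% together with the list of currently connected nodes.
diagnose(Name) ->
    case global:whereis_name(Name) of
        undefined ->
            %% Not registered (or no longer alive): the connected nodes
            %% show whether the node you expected is reachable at all.
            {undefined, [{connected_nodes, nodes()}]};
        Pid ->
            {Pid, [{alive, alive(Pid)}, {connected_nodes, nodes()}]}
    end.

%% is_process_alive/1 only accepts local pids, so for a remote pid
%% the question is asked on the node that owns it.
alive(Pid) when node(Pid) =:= node() ->
    is_process_alive(Pid);
alive(Pid) ->
    rpc:call(node(Pid), erlang, is_process_alive, [Pid]).
```

For example, global_check:diagnose(remote_process) on either node tells you whether the name is known and, if so, whether the process behind it is still running.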

Edit 2: Start node 1 with werl -sname p1 and type in the shell:

(p1@W7FRR00423L)1> c(remote_process).
{ok,remote_process}
(p1@W7FRR00423L)2> remote_process:start().
I'm on node p1@W7FRR00423L
My pid is <0.69.0>
{ok,<0.69.0>}
(p1@W7FRR00423L)3> global:whereis_name(remote_process).
<0.69.0>
(p1@W7FRR00423L)4>

Then start a second node with werl -sname p2 and type the following in its shell (it is fine to connect the second node later; the global registration is synchronized when the nodes connect):

(p2@W7FRR00423L)1> net_kernel:connect_node(p1@W7FRR00423L).
true
(p2@W7FRR00423L)2> nodes().
[p1@W7FRR00423L]
(p2@W7FRR00423L)3> global:whereis_name(remote_process).
<7080.69.0>
(p2@W7FRR00423L)4> 

Edit 3:

In your test you are spawning a process P1 on the remote node which executes the function remote_process:start/0.

This function calls supervisor:start_link/3, which spawns a new supervisor process P2 and links it to P1. After that, P1 has nothing left to do, so it exits. A supervisor traps exits and terminates when its parent exits, even with reason normal, so the linked process P2 dies too and you get undefined from the global:whereis_name call.

In my test, I start the process from the shell of the remote node; the shell does not die after I evaluate remote_process:start/0, so the supervisor process does not die and global:whereis_name finds the requested pid.
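
You can reproduce this effect on a single node, without any distribution. The following is a minimal sketch (the module link_demo is a made-up name, and the supervisor is started without registration to keep the example small): a short-lived process starts a linked supervisor and then exits, while the caller monitors the supervisor to see what happens to it.

```erlang
-module(link_demo).
-behaviour(supervisor).
-export([run/0, init/1]).

%% Spawns a short-lived process that starts a linked supervisor and
%% then returns, and reports whether the supervisor survives that.
run() ->
    Caller = self(),
    P1 = spawn(fun() ->
        {ok, SupPid} = supervisor:start_link(?MODULE, []),
        Caller ! {started, self(), SupPid}
        %% The fun ends here, so this process exits with reason normal.
    end),
    receive
        {started, P1, Sup} ->
            Ref = erlang:monitor(process, Sup),
            receive
                {'DOWN', Ref, process, Sup, Reason} ->
                    {supervisor_died, Reason}
            after 1000 ->
                supervisor_still_alive
            end
    after 1000 ->
        timeout
    end.

%% Empty supervisor, same spec as in the question.
init([]) ->
    {ok, {{one_for_one, 1, 5}, []}}.
```

link_demo:run() should return {supervisor_died, Reason}. The reason depends on timing: it is the supervisor's exit reason if the monitor was set up in time, or noproc if the supervisor was already gone.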

If you want the supervisor to survive the call, you need an intermediate process that is spawned without a link, so that it does not die with its parent. Here is a small example based on your code:

-module(remote_process).
-behaviour(supervisor).
-export([start/0, init/1,local_spawn/0,remote_start/1]).

remote_start(Node) ->
    spawn(Node,?MODULE,local_spawn,[]).

local_spawn() ->
    % spawn without link so start_wait_stop will survive
    % the death of the local_spawn process
    spawn(fun start_wait_stop/0).

start_wait_stop() ->
    start(),
    receive
        stop -> ok
    end.

start() ->
    io:format("start (~p)~n",[self()]),
    {ok, Pid} = supervisor:start_link({global, ?MODULE}, ?MODULE, []),
    {ok, Pid}.

init([]) ->
    io:format("I'm on node ~p~n", [node()]),
    io:format("My pid is ~p~n", [self()]),
    {ok, {{one_for_one, 1, 5}, []}}.

In the shell on node 1 you get:

(p1@W7FRR00423L)1> net_kernel:connect_node(p2@W7FRR00423L).
true
(p1@W7FRR00423L)2> c(remote_process).
{ok,remote_process}
(p1@W7FRR00423L)3> global:whereis_name(remote_process).
undefined
(p1@W7FRR00423L)4> remote_process:remote_start(p2@W7FRR00423L).
<7080.68.0>
start (<7080.69.0>)
I'm on node p2@W7FRR00423L
My pid is <7080.70.0>
(p1@W7FRR00423L)5> global:whereis_name(remote_process).        
<7080.70.0>
(p1@W7FRR00423L)6> global:whereis_name(remote_process).
undefined

and on node 2 (the step numbers in the comments refer to the prompt numbers on node 1):

(p2@W7FRR00423L)1> global:registered_names(). % before step 4
[]
(p2@W7FRR00423L)2> global:registered_names(). % after step 4
[remote_process]
(p2@W7FRR00423L)3> rp(processes()).
[<0.0.0>,<0.1.0>,<0.4.0>,<0.30.0>,<0.31.0>,<0.33.0>,
 <0.34.0>,<0.35.0>,<0.36.0>,<0.37.0>,<0.38.0>,<0.39.0>,
 <0.40.0>,<0.41.0>,<0.42.0>,<0.43.0>,<0.44.0>,<0.45.0>,
 <0.46.0>,<0.47.0>,<0.48.0>,<0.49.0>,<0.50.0>,<0.51.0>,
 <0.52.0>,<0.53.0>,<0.54.0>,<0.55.0>,<0.56.0>,<0.57.0>,
 <0.58.0>,<0.62.0>,<0.64.0>,<0.69.0>,<0.70.0>]
ok
(p2@W7FRR00423L)4> pid(0,69,0) ! stop. % between steps 5 and 6
stop
(p2@W7FRR00423L)5> global:registered_names().
[]
