Luis Lavieri

Reputation: 4129

Infinite loop in Erlang process

I'm very new to Erlang and tried to implement a simple class that has some methods to simulate a database. insert() just inserts a key -> value in the process map, and retrieve() just returns the value from the map. However, I am getting stuck in the loop(). What am I doing wrong?

-module(db).
-export([start/0,stop/0,retrieve/1,insert/2]).

start() ->
    register(db, spawn(fun() ->
              loop() 
          end)
        ),
    {started}.


insert(Key, Value) ->
    rpc({insert, Key, Value}).

retrieve(Key) ->
    rpc({retrieve, Key}).

stop() ->
    rpc({stop}).

rpc(Request) ->
    db ! {self(), Request},
    receive
    {db, Reply} ->
        Reply
    end.

loop() ->
    receive
    {rpc, {insert, Key, Value}} ->
        put(Key, Value),
        rpc ! {db, done},
        loop();
    {rpc, {retrieve, Key}} ->
        Val = get(Key),
        rpc ! {db, Val},
        loop();
    {rpc, {stop}} ->
        exit(db,ok),
        rpc ! {db, stopped}    
    end.

So, after compiling, I first call db:start(). and then, when trying db:insert("A", 1)., it gets stuck.

Thank you

Upvotes: 0

Views: 2050

Answers (2)

zxq9

Reputation: 13164

Let's walk through this a bit more carefully. What do you mean by "rpc"? "Remote Procedure Call" -- sure. But everything in Erlang is an rpc, so we tend not to use that term. Instead we distinguish between synchronous messages (where the caller blocks, waiting on a response) and asynchronous messages (where the caller just fires off a message and runs off without a care in the world). We tend to use the term "call" for a synch message and "cast" for an asynch message.

We can write that easily, as a call looks a lot like your rpc above, with the added idiom in Erlang of adding a unique reference value to tag the message and monitoring the process we sent a message to just in case it crashes (so we don't get left hanging, waiting for a response that will never come... which we'll touch on in your code in a bit):

% Synchronous handler
call(Proc, Request) ->
    Ref = monitor(process, Proc),
    Proc ! {self(), Ref, Request},
    receive
        {Ref, Res} ->
            demonitor(Ref, [flush]),
            Res;
        % If Proc is a registered name, 'DOWN' carries {Name, Node}, so match any value here
        {'DOWN', Ref, process, _, Reason} ->
            {fail, Reason}
    after 1000 ->
        demonitor(Ref, [flush]),
        {fail, timeout}
    end.

Cast is a bit easier:

cast(Proc, Message) ->
    Proc ! Message,
    ok.

The definition of call above means that the process we are sending to will receive a message of the form {SenderPID, Reference, Message}. Note that this is different than {sender, reference, message}, as lower-case values are atoms, meaning they are their own values.

When we receive messages we are matching on the shape and values of the message received. That means if I have

receive
    {number, X} ->
        do_stuff(X)
end

in my code and the process sitting in that receive gets a message {blah, 25}, it will not match. If it then receives another message, {number, 26}, that will match: the receive will call do_stuff/1 and the process will continue on. (These two things, the difference between atoms and Variables and the way matching in receive works, are why your code is hanging.) The initial message, {blah, 25}, will still be in the mailbox, though, at the front of the queue, so the next receive has a chance to match on it. This property of mailboxes is immensely useful sometimes.
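Here is a minimal sketch of that mailbox behaviour, if you want to see it for yourself. The module name mailbox_demo and the function run/0 are made up for this illustration; the process simply sends two messages to itself and then receives them selectively.

```erlang
-module(mailbox_demo).
-export([run/0]).

%% Hypothetical demo: send two messages to ourselves, then receive selectively.
run() ->
    self() ! {blah, 25},
    self() ! {number, 26},
    %% Only {number, X} matches here; {blah, 25} is skipped, not dropped.
    First = receive {number, X} -> X end,
    %% The skipped message is still queued, so a later receive can take it.
    Second = receive {blah, Y} -> Y after 0 -> none end,
    {First, Second}.
```

Calling mailbox_demo:run() from a fresh shell should give back both values, the second one coming from the message that the first receive left behind.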

But what does a catch-all look like?

Above you are expecting three kinds of messages:

  • {insert, Key, Value}
  • {retrieve, Key}
  • stop

You dressed them up differently, but that's the business end of what you are trying to do. Run through the call/2 function I wrote above, the insert message would wind up looking like this: {From, Ref, {insert, Key, Value}}. So if we expect any response from the process's receive loop, we will need to match on that exact form. How do we catch unexpected messages or badly formed ones? At the end of the receive expression we can put a single naked variable to match on anything else:

loop(State) ->
    receive
        {From, Ref, {insert, Key, Value}} ->
            NewState = insert(Key, Value, State),
            From ! {Ref, ok},
            loop(NewState);
        {From, Ref, {retrieve, Key}} ->
            Value = retrieve(Key, State),
            From ! {Ref, {ok, Value}},
            loop(State);
        {From, Ref, stop} ->
            ok = io:format("~tp: ~tp told me to stop!~n", [self(), From]),
            From ! {Ref, shutting_down},
            exit(normal);
        Unexpected ->
            ok = io:format("~tp: Received unexpected message: ~tp~n",
                           [self(), Unexpected]),
            loop(State)
    end.

You will notice that I am not using the process dictionary. DO NOT USE THE PROCESS DICTIONARY. This isn't what it is for. You'll overwrite something important. Or drop something important. Or... bleh, just don't do it. Use a dict or map or gb_tree or whatever instead, and pass it through as the process' State variable. This will become a very natural thing for you once you start writing OTP code later on.
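To make the "pass it through as State" idea concrete, here is one way the pieces could fit together, using a map as the state. Treat it as a sketch under assumptions rather than the one true version: the module name db_sketch is invented, the map-based storage via maps:put/3 and maps:find/2 is my choice, and it assumes start/0 has been called before any of the other functions.

```erlang
-module(db_sketch).
-export([start/0, stop/0, insert/2, retrieve/1]).

%% Spawn the server with an empty map as its initial state.
%% (db_sketch is a hypothetical name chosen for this example.)
start() ->
    register(db_sketch, spawn(fun() -> loop(#{}) end)),
    started.

insert(Key, Value) -> call(db_sketch, {insert, Key, Value}).
retrieve(Key)      -> call(db_sketch, {retrieve, Key}).
stop()             -> call(db_sketch, stop).

%% Synchronous request: tag it with a monitor reference and wait for the reply.
call(Name, Request) ->
    Ref = monitor(process, Name),
    Name ! {self(), Ref, Request},
    receive
        {Ref, Reply} ->
            demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, _, Reason} ->
            {fail, Reason}
    after 1000 ->
        demonitor(Ref, [flush]),
        {fail, timeout}
    end.

%% The map travels through every recursive call; nothing is stored
%% in the process dictionary.
loop(State) ->
    receive
        {From, Ref, {insert, Key, Value}} ->
            From ! {Ref, ok},
            loop(maps:put(Key, Value, State));
        {From, Ref, {retrieve, Key}} ->
            %% maps:find/2 returns {ok, Value} or the atom error.
            From ! {Ref, maps:find(Key, State)},
            loop(State);
        {From, Ref, stop} ->
            From ! {Ref, shutting_down},
            ok;
        _Unexpected ->
            loop(State)
    end.
```

After db_sketch:start(), a call like db_sketch:insert("A", 1) followed by db_sketch:retrieve("A") goes through the same call/loop round trip described above, and a key that was never inserted comes back as error instead of crashing anything.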

Toy around with these things a bit and you will soon be happily spamming your processes to death.

Upvotes: 1

Paulo Suassuna

Reputation: 131

The problem is in the loop/0 function. You're using the rpc atom to pattern match the messages received ({rpc, {insert, Key, Value}}), but, as you can see in the rpc/1 function, you always send messages with the format {self(), Request} to the db process.

The self() function returns a PID, printed in the format <X.Y.Z>, which will never match against the atom rpc.

For example, let's say you're trying to insert some data using the function insert/2 and self() returns the PID <0.36.0>. When rpc/1 sends the message, on the line db ! {self(), {insert, Key, Value}}, loop/0 will receive the message {<0.36.0>, {insert, Key, Value}}, which will never match against {rpc, {insert, Key, Value}}, because rpc is an atom.
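You can check this matching behaviour in isolation with a small sketch like the one below. The module name match_demo and the function run/0 are only illustrative; it builds the same kind of tuple that rpc/1 sends and tries both patterns against it.

```erlang
-module(match_demo).
-export([run/0]).

%% Hypothetical demo: compare an atom pattern with a variable pattern.
run() ->
    Msg = {self(), {insert, "A", 1}},
    %% The atom rpc only matches the atom rpc, so this clause is skipped.
    AtomPattern = case Msg of
                      {rpc, _} -> matched;
                      _ -> no_match
                  end,
    %% An unbound variable matches any term, including a PID.
    VarPattern = case Msg of
                     {Rpc, _} when is_pid(Rpc) -> matched;
                     _ -> no_match
                 end,
    {AtomPattern, VarPattern}.
```

The first element of the result shows the atom pattern failing against a PID, and the second shows the variable pattern succeeding on the same message.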

The solution is to change rpc atom to a variable, like this:

loop() ->
    receive
        {Rpc, {insert, Key, Value}} ->
            put(Key, Value),
            Rpc ! {db, done},
            loop();
        {Rpc, {retrieve, Key}} ->
            Val = get(Key),
            Rpc ! {db, Val},
            loop();
        {Rpc, {stop}} ->
            Rpc ! {db, stopped},
            exit(whereis(db),ok)
    end.

Erlang variables start with a capital letter, which is why I used Rpc instead of rpc.

P.S.: Actually, you had two other problems:

  1. In the last part of loop/0, where you handle stop message, you call exit(db, ok) before you actually answer to rpc. In that case, you'd never receive the {db, stopped} message back from db process, which would be dead by that time. That's why I've changed the order, putting the exit/2 call after Rpc ! {db, stopped}.
  2. When you called exit/2, you were passing db, which is an atom, as the first argument, but the exit/2 function expects a PID as its first argument, so the call would raise a badarg error. That's why I've changed it to exit(whereis(db), ok).

Upvotes: 4
