Reputation: 141
I have designed a mnesia database with 5 different tables. The idea is to simulate queries from many nodes (computers), not just one. At the moment I can execute a query from the terminal, but I need help on how to make it such that I am requesting information from many computers. I am testing for scalability and want to investigate the performance of mnesia versus other databases. Any idea will be highly appreciated.
Upvotes: 2
Views: 3319
Reputation: 7836
The best way to test mnesia is by running an intensive, heavily concurrent job both on the local Erlang node where mnesia is running and on remote nodes. Usually, you want remote nodes to use RPC calls in which reads and writes are executed against the mnesia tables. Of course, with high concurrency comes a trade-off: transaction speed will drop and many transactions may be retried, since many locks can be held at a given time; but mnesia will ensure that every process receives an {atomic,ok} for each transactional call it makes.
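For instance, a remote node that is connected to the database node could issue its reads and writes along these lines. This is only a minimal sketch: DbNode stands for the name of the node running mnesia, key_value is the example table used later in this answer, and the module containing these funs must also be loaded on DbNode since the funs are executed there.

%% a minimal sketch: DbNode is the node where mnesia and the key_value
%% table (shown later in this answer) are running. The module defining
%% these funs must also be loaded on DbNode, because the funs are
%% evaluated on that node.
remote_write(DbNode, Record)->
    rpc:call(DbNode, mnesia, transaction,
             [fun() -> mnesia:write(Record) end]).

remote_read(DbNode, Key)->
    rpc:call(DbNode, mnesia, transaction,
             [fun() -> mnesia:read({key_value, Key}) end]).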
The Concept
I propose that we create a non-blocking overload, with both writes and reads directed at each mnesia table by as many processes as possible. We measure the time difference between the call to the write function and the time it takes for our mnesia subscriber to receive a Write Event. These events are sent by mnesia after every successful transaction, so we need not interrupt the working/overloading processes but rather let a dedicated mnesia subscriber wait for asynchronous events reporting successful deletes and writes as soon as they occur.
The technique here is that we take a timestamp just before calling the write function and note down the record key together with the write CALL timestamp. Our mnesia subscriber then notes down the record key together with the write/read EVENT timestamp. The difference between these two timestamps (let's call it the CALL-to-EVENT Time) gives us a rough idea of how loaded, or how efficient, we are. As locks increase with concurrency, we should register an increasing CALL-to-EVENT Time. Processes doing writes (unlimited) will do so concurrently, while those doing reads will also continue without interruption. We will choose the number of processes for each operation, but let's first lay the ground for the entire test case.
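As a small illustration of that measurement (only a sketch; CallTs and EventTs stand for the two timestamps described above, taken with erlang:now/0):

%% a minimal sketch: CallTs is taken just before mnesia:write/1 is called,
%% EventTs is taken when our subscriber receives the corresponding mnesia
%% event; both are erlang:now()-style tuples and the result is in microseconds
call_to_event_time(CallTs, EventTs)->
    timer:now_diff(EventTs, CallTs).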
All of the above concept is for local operations (processes running on the same node as mnesia).
--> Simulating Many Nodes
Well, I have personally not simulated nodes in Erlang; I have always worked with real Erlang nodes on the same box or on several different machines in a networked environment. However, I advise that you look closely at this module: http://www.erlang.org/doc/man/slave.html, concentrate more on this one: http://www.erlang.org/doc/man/ct_slave.html, and look at the following links as they talk about creating, simulating and controlling many nodes under another parent node (http://www.erlang.org/doc/man/pool.html, Erlang: starting slave node, https://support.process-one.net/doc/display/ERL/Starting+a+set+of+Erlang+cluster+nodes, http://www.berabera.info/oldblog/lenglet/howtos/erlangkerberosremctl/index.html). I will not dive into a jungle of Erlang nodes here because that is another complicated topic, but I will concentrate on tests on the same node running mnesia. I have come up with the above mnesia test concept, so let's start implementing it.
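If you do want to experiment with spawning extra nodes from within Erlang, here is a minimal sketch using slave:start/2 and mnesia:change_config/2. It assumes the parent node was started with -sname or -name, that mnesia is already running on it, and that the slave names below are only placeholders.

%% a minimal sketch: starts two slave nodes on the local host and connects
%% their mnesia instances to this node's schema. Assumes this (parent) node
%% is alive (started with -sname/-name) and that mnesia is running here.
start_slaves()->
    [_, Host] = string:tokens(atom_to_list(node()), "@"),
    [begin
         {ok, Slave} = slave:start(list_to_atom(Host), Name),
         ok = rpc:call(Slave, mnesia, start, []),
         {ok, _} = rpc:call(Slave, mnesia, change_config,
                            [extra_db_nodes, [node()]]),
         Slave
     end || Name <- [slave1, slave2]].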
Now, first of all, you need to make a test plan for each table (separately). This should include both writes and reads. Then you need to decide whether you want to do dirty operations or transactional operations on the tables. You also need to test the speed of traversing a mnesia table in relation to its size. Let's take an example of a simple mnesia table:
-record(key_value,{key,value,instanceId,pid}).
We would want to have a general function for writing into our table, here below:
write(Record)->
    %% Use mnesia:activity/4 to test several activity
    %% contexts (and if your table is fragmented)
    %% like the commented code below
    %%
    %% mnesia:activity(
    %%      transaction, %% sync_transaction | async_dirty | ets | sync_dirty
    %%      fun(Y) -> mnesia:write(Y) end,
    %%      [Record],
    %%      mnesia_frag
    %% )
    mnesia:transaction(fun() -> ok = mnesia:write(Record) end).
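As an aside on the earlier point about choosing between dirty and transactional operations: a rough way to compare the two contexts is timer:tc/3. The helper below is only a minimal sketch and assumes the key_value table already exists.

%% a minimal sketch: times one transactional write and one dirty write
%% of the same record, returning each elapsed time in microseconds
compare_write_contexts(Record)->
    {TxMicros,_}    = timer:tc(mnesia, transaction,
                               [fun() -> mnesia:write(Record) end]),
    {DirtyMicros,_} = timer:tc(mnesia, dirty_write, [Record]),
    [{transaction,TxMicros},{dirty,DirtyMicros}].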
And for our reads, we will have:
read(Key)->
    %% Use mnesia:activity/4 to test several activity
    %% contexts (and if your table is fragmented)
    %% like the commented code below
    %%
    %% mnesia:activity(
    %%      transaction, %% sync_transaction | async_dirty | ets | sync_dirty
    %%      fun(Y) -> mnesia:read({key_value,Y}) end,
    %%      [Key],
    %%      mnesia_frag
    %% )
    mnesia:transaction(fun() -> mnesia:read({key_value,Key}) end).

Now, we want to write very many records into our small table. We need a key generator. This key generator will be our own pseudo-random string generator. However, we need our generator to tell us the instant it generates a key so that we can record it. We want to see how long it takes to write a generated key. Let's put it down like this:
timestamp()-> erlang:now().

str(XX)-> integer_to_list(XX).

%% pure pseudo-random hex string generation, shared by
%% generate_instance_id/0 and guid/0
guid_str()->
    random:seed(now()),
    MD5 = erlang:md5(term_to_binary({self(),time(),node(),now(),make_ref()})),
    MD5List = binary_to_list(MD5),
    F = fun(N) -> io_lib:format("~2.16.0B",[N]) end,
    lists:flatten([F(N) || N <- MD5List]).

generate_instance_id()->
    guid_str() ++ str(crypto:rand_uniform(1, 65536 * 65536)) ++
        str(erlang:phash2({self(),make_ref(),time()})).

guid()->
    L = guid_str(),
    %% tell our massive mnesia subscriber about this generation
    InstanceId = generate_instance_id(),
    mnesia_subscriber ! {self(),{key,write,L,timestamp(),InstanceId}},
    {L,InstanceId}.

To make very many concurrent writes, we need a function which will be executed by the many processes we will spawn. In this function, it is desirable NOT to put any blocking functions such as
sleep/1, usually implemented as sleep(T)-> receive after T -> true end. Such a function would make a process's execution hang for the specified number of milliseconds. mnesia_tm does the lock control, retries, blocking, etc. on behalf of the processes to avoid deadlocks. Let's say we want each process to write an unlimited number of records. Here is our function:
-define(NO_OF_PROCESSES,20).

start_write_jobs()->
    [spawn(?MODULE,generate_and_write,[]) || _ <- lists:seq(1,?NO_OF_PROCESSES)],
    ok.

generate_and_write()->
    %% remember that in the function ?MODULE:guid/0,
    %% we inform our mnesia_subscriber about our generated key
    %% together with the timestamp of the generation just before
    %% a write is made.
    %% The subscriber will note this down in the benchmark mnesia table
    %% and then wait for the mnesia event about the write operation.
    %% Then it will take the event timestamp and calculate the time
    %% difference. From there we can make a judgement on performance.
    %% In this case, we make the processes perform unlimited writes
    %% into our mnesia tables. Our subscriber will trap the events as soon as
    %% a successful write is made in mnesia.
    %% For all keys we just write a Zero as the value.
    {Key,Instance} = guid(),
    write(#key_value{key = Key,value = 0,instanceId = Instance,pid = self()}),
    generate_and_write().
Likewise, let's see how the read jobs will be done. We will have a key provider; this key provider keeps rotating around the mnesia table picking only keys, traversing up and down the table over and over. Here is its code:
first()-> mnesia:dirty_first(key_value).

next(FromKey)-> mnesia:dirty_next(key_value,FromKey).

start_key_picker()-> register(key_picker,spawn(fun() -> key_picker() end)).

key_picker()->
    try ?MODULE:first() of
        '$end_of_table' ->
            io:format("\n\tTable is empty, my dear !~n",[]),
            %% lets throw something in there to start with
            {NewKey,Instance} = guid(),
            ?MODULE:write(#key_value{key = NewKey,value = 0,instanceId = Instance,pid = self()}),
            key_picker();
        Key -> wait_key_reqs(Key)
    catch
        EXIT:REASON ->
            error_logger:error_report(["Key Picker dies",{EXIT,REASON}]),
            exit({EXIT,REASON})
    end.

wait_key_reqs('$end_of_table')->
    receive
        {From,<<"get_key">>} ->
            Key = ?MODULE:first(),
            From ! {self(),Key},
            wait_key_reqs(?MODULE:next(Key));
        {_,<<"stop">>} -> exit(normal)
    end;
wait_key_reqs(Key)->
    receive
        {From,<<"get_key">>} ->
            From ! {self(),Key},
            NextKey = ?MODULE:next(Key),
            wait_key_reqs(NextKey);
        {_,<<"stop">>} -> exit(normal)
    end.

key_picker_rpc(Command)->
    try erlang:send(key_picker,{self(),Command}) of
        _ ->
            receive
                {_,Reply} -> Reply
            after timer:seconds(60) ->
                %% key_picker hung, or is too busy
                erlang:throw({key_picker,hanged})
            end
    catch
        _:_ ->
            %% key_picker dead
            start_key_picker(),
            timer:sleep(timer:seconds(5)),
            key_picker_rpc(Command)
    end.

%% Now, this is where the reader processes will be
%% accessing keys. It will appear to them as though
%% its random, because its one process doing the
%% traversal. It will all be a game of chance
%% depending on the scheduler's choice of
%% who gets the next read chance, and he will
%% win ! okay, lets get going below :)

get_key()->
    Key = key_picker_rpc(<<"get_key">>),
    %% lets report to our "massive" mnesia subscriber
    %% about a read which is about to happen
    %% together with a time stamp.
    Instance = generate_instance_id(),
    mnesia_subscriber ! {self(),{key,read,Key,timestamp(),Instance}},
    {Key,Instance}.
Wow !!! Now we need to create the function where we will start all the readers.
-define(NO_OF_READERS,10).

start_read_jobs()->
    [spawn(?MODULE,constant_reader,[]) || _ <- lists:seq(1,?NO_OF_READERS)],
    ok.

constant_reader()->
    {Key,InstanceId} = ?MODULE:get_key(),
    case ?MODULE:read(Key) of
        {atomic,[Record]} ->
            %% Tell mnesia_subscriber that a read has been done so it creates a timestamp
            mnesia:report_event({read_success,Record,self(),InstanceId});
        _ -> ok
    end,
    constant_reader().
Now, the biggest part: mnesia_subscriber !!! This is a simple process that subscribes to simple mnesia events. Get the mnesia events documentation from the mnesia user's guide. Here is the mnesia subscriber:
-record(read_instance,{
        instance_id,
        before_read_time,
        after_read_time,
        read_time       %% after_read_time - before_read_time
        }).

-record(write_instance,{
        instance_id,
        before_write_time,
        after_write_time,
        write_time      %% after_write_time - before_write_time
        }).

-record(benchmark,{
        id,             %% {pid(),Key}
        read_instances = [],
        write_instances = []
        }).

subscriber()->
    mnesia:subscribe({table,key_value, simple}),
    %% lets also subscribe for system
    %% events because events passing through
    %% mnesia:report_event/1 will go via
    %% system events.
    mnesia:subscribe(system),
    wait_events().

-include_lib("stdlib/include/qlc.hrl").

wait_events()->
    receive
        {From,{key,write,Key,TimeStamp,InstanceId}} ->
            %% A process is just about to call
            %% mnesia:write/1 and so we note this down
            Fun = fun() ->
                    case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),X#benchmark.id == {From,Key}])) of
                        [] ->
                            ok = mnesia:write(#benchmark{
                                    id = {From,Key},
                                    write_instances = [
                                        #write_instance{
                                            instance_id = InstanceId,
                                            before_write_time = TimeStamp
                                        }]
                                    }),
                            ok;
                        [Here] ->
                            WIs = Here#benchmark.write_instances,
                            NewInstance = #write_instance{
                                            instance_id = InstanceId,
                                            before_write_time = TimeStamp
                                        },
                            ok = mnesia:write(Here#benchmark{write_instances = [NewInstance|WIs]}),
                            ok
                    end
                end,
            mnesia:transaction(Fun),
            wait_events();
        {mnesia_table_event,{write,#key_value{key = Key,instanceId = I,pid = From},_ActivityId}} ->
            %% A process has successfully made a write. So we look it up and
            %% get the timestamp difference, and finish benchmarking that write
            WriteTimeStamp = timestamp(),
            F = fun()->
                    [Here] = mnesia:read({benchmark,{From,Key}}),
                    WIs = Here#benchmark.write_instances,
                    {value,WriteInstance} = lists:keysearch(I,2,WIs),
                    BeforeTmStmp = WriteInstance#write_instance.before_write_time,
                    NewWI = WriteInstance#write_instance{
                                after_write_time = WriteTimeStamp,
                                write_time = time_diff(WriteTimeStamp,BeforeTmStmp)
                            },
                    ok = mnesia:write(Here#benchmark{write_instances = [NewWI|lists:keydelete(I,2,WIs)]}),
                    ok
                end,
            mnesia:transaction(F),
            wait_events();
        {From,{key,read,Key,TimeStamp,InstanceId}} ->
            %% A process is just about to do a read
            %% using mnesia:read/1 and so we note this down
            Fun = fun()->
                    case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),X#benchmark.id == {From,Key}])) of
                        [] ->
                            ok = mnesia:write(#benchmark{
                                    id = {From,Key},
                                    read_instances = [
                                        #read_instance{
                                            instance_id = InstanceId,
                                            before_read_time = TimeStamp
                                        }]
                                    }),
                            ok;
                        [Here] ->
                            RIs = Here#benchmark.read_instances,
                            NewInstance = #read_instance{
                                            instance_id = InstanceId,
                                            before_read_time = TimeStamp
                                        },
                            ok = mnesia:write(Here#benchmark{read_instances = [NewInstance|RIs]}),
                            ok
                    end
                end,
            mnesia:transaction(Fun),
            wait_events();
        {mnesia_system_event,{mnesia_user,{read_success,#key_value{key = Key},From,I}}} ->
            %% A process has successfully made a read. So we look it up and
            %% get the timestamp difference, and finish benchmarking that read
            ReadTimeStamp = timestamp(),
            F = fun()->
                    [Here] = mnesia:read({benchmark,{From,Key}}),
                    RIs = Here#benchmark.read_instances,
                    {value,ReadInstance} = lists:keysearch(I,2,RIs),
                    BeforeTmStmp = ReadInstance#read_instance.before_read_time,
                    NewRI = ReadInstance#read_instance{
                                after_read_time = ReadTimeStamp,
                                read_time = time_diff(ReadTimeStamp,BeforeTmStmp)
                            },
                    ok = mnesia:write(Here#benchmark{read_instances = [NewRI|lists:keydelete(I,2,RIs)]}),
                    ok
                end,
            mnesia:transaction(F),
            wait_events();
        _ -> wait_events()
    end.

%% rough element-wise difference of two erlang:now() tuples;
%% timer:now_diff/2 could be used instead for a difference in microseconds
time_diff({A2,B2,C2} = _After,{A1,B1,C1} = _Before)->
    {A2 - A1,B2 - B1,C2 - C1}.
Alright! That was huge :) So we are done with the subscriber. We now need the code that will crown it all and run the necessary tests.
install()->
    mnesia:stop(),
    mnesia:delete_schema([node()]),
    mnesia:create_schema([node()]),
    mnesia:start(),
    {atomic,ok} = mnesia:create_table(key_value,[
        {attributes,record_info(fields,key_value)},
        {disc_copies,[node()]}
    ]),
    {atomic,ok} = mnesia:create_table(benchmark,[
        {attributes,record_info(fields,benchmark)},
        {disc_copies,[node()]}
    ]),
    mnesia:stop(),
    ok.
start()->
    mnesia:start(),
    ok = mnesia:wait_for_tables([key_value,benchmark],timer:seconds(120)),
    %% boot up our subscriber
    register(mnesia_subscriber,spawn(?MODULE,subscriber,[])),
    start_write_jobs(),
    start_key_picker(),
    start_read_jobs(),
    ok.
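Assuming all of the code above is saved in one module (the module name mnesia_bench below is only a placeholder), a run from the Erlang shell would look something like:

1> c(mnesia_bench).
2> mnesia_bench:install().
3> mnesia_bench:start().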
Now, with proper analysis of the benchmark table records, you will get a record of average read times, average write times, etc. You can draw a graph of these times against an increasing number of processes. As we increase the number of processes, you will discover that the read and write times increase. Get the code, read it and make use of it. You may not use all of it, but I am sure you could pick up new concepts from there as others send in their solutions. Using mnesia events is the best way to test mnesia reads and writes without blocking the processes doing the actual writing or reading. In the example above, the reading and writing processes are out of any control; in fact, they will run forever until you terminate the VM. You can traverse the benchmark table with a good formula to make use of the read and write times per read or write instance, and then calculate averages, variations, etc.
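For example, here is a minimal sketch of such a traversal, assuming the benchmark and write_instance records above and that the stored timestamps are erlang:now()-style tuples; it computes a rough average write CALL-to-EVENT time in microseconds:

%% a minimal sketch: average write CALL-to-EVENT time in microseconds,
%% computed from the before_write_time / after_write_time fields of every
%% completed write_instance in the benchmark table
average_write_time_us()->
    F = fun()->
            mnesia:foldl(
              fun(#benchmark{write_instances = WIs},{Sum,Count})->
                      lists:foldl(
                        fun(#write_instance{before_write_time = B,
                                            after_write_time = A},{S,C})
                              when A =/= undefined ->
                                {S + timer:now_diff(A,B), C + 1};
                           (_,Acc) -> Acc
                        end, {Sum,Count}, WIs)
              end, {0,0}, benchmark)
        end,
    case mnesia:transaction(F) of
        {atomic,{_,0}} -> no_completed_writes;
        {atomic,{Sum,Count}} -> Sum / Count
    end.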
As a consequence, the concepts behind mnesia can only be compared with Ericsson's NDB database, found here: http://ww.dolphinics.no/papers/abstract/ericsson.html, but not with existing RDBMSs, document-oriented databases, etc. Those are my thoughts :) let's wait for what others have to say.....
Upvotes: 8
Reputation: 64
You start additional nodes using a command like this:
erl -name [email protected] -cookie devel \
-mnesia extra_db_nodes "['[email protected]']"\
-s mnesia start
where '[email protected]' is the node where mnesia is already set up. In this case all tables will be accessed from the remote node, but you can make local copies with mnesia:add_table_copy/3.
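For example, to keep a RAM copy of the example key_value table on the newly attached node, you could run something like this in its shell (only a sketch; ram_copies is just one possible storage type):

%% a minimal sketch, run on the newly attached node; key_value is the
%% example table from the answer above, ram_copies one possible storage type
{atomic, ok} = mnesia:add_table_copy(key_value, node(), ram_copies).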
Then you can use spawn/2 or spawn/4 to start load generation on all nodes with something like:
lists:foreach(fun(N) ->
                  spawn(N, fun () ->
                               %% generate some load
                               ok
                           end)
              end,
              [ '[email protected]', '[email protected]' ]).
Upvotes: 0