Rodion Gorkovenko

Reputation: 2852

What is the Erlang way to collect/aggregate incoming data?

While learning a bit of Erlang I soon ran into the question of storing "global" data. I understand that "mutable" variables are not exactly the Erlang-ish approach.

A simple example, calculating something like a moving average of numbers read from the console, was comparatively easy to do with tail recursion, passing the value on to each next iteration:

-module(movavg).
-export([start/0]).

start()->
    runForever(0.0).

runForever(Avg)->
    {ok, [X]} = io:fread("", "~f"),
    Avg2 = Avg - Avg / 5 + X / 5,
    io:format("avg=~p~n", [Avg2]),
    runForever(Avg2).

But since I come from a more industrial (particularly microservices) background, I now wonder how this should work in an "asynchronous" style, where we typically build the app as an HTTP server that accepts requests.

I've found examples of starting a tiny HTTP server in Erlang, which seem quite nice, like the one mentioned here: How to write a simple webserver in Erlang?

But I'm not sure how I should keep data between requests, e.g. if I just want to sum the numbers coming in with requests, or something like that...

I believe one approach is to use ets tables, but I'm not sure it is the right idea. And surely this is not "immutable" storage... :)

Could you enlighten me please?

Upvotes: 1

Views: 369

Answers (3)

Mathieu Kerjouan

Reputation: 525

You can use ets (in-memory tables), dets (disk-based tables) or mnesia (a distributed database built on top of ets and dets).

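If you want to use ets or dets, don't forget that an ets table is deleted when the process that owns it exits, so create it inside a long-lived gen_statem or gen_server (gen_fsm also works but is deprecated) and build an API to query it. You can do something like this: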

-behaviour(gen_server).
-export([init/1]).
-export([handle_call/3]).

% the gen_server process creates and owns the table
init(_) ->
  Ets = ets:new(?MODULE, []),
  {ok, Ets}.

% forward requests to your ets table
% look up a key (the table comes first, then the key):
handle_call({lookup, Key}, _From, Ets) ->
  Response = ets:lookup(Ets, Key),
  {reply, Response, Ets};
% add new data (a tuple, or a list of tuples):
handle_call({insert, Data}, _From, Ets) ->
  Response = ets:insert(Ets, Data),
  {reply, Response, Ets}.

% Rest of your code...

If ETS or DETS doesn't match your needs, you can build your own shared data structure on top of an existing one (dict, orddict, gb_sets, gb_trees, queue, digraph, sets, ordsets...), wrapped in a standard behaviour. Here is an example based on the dict data structure and the gen_server behaviour:

-module(gen_dict).
-behaviour(gen_server).
-export([start/0, start/1]).
-export([start_link/0, start_link/1]).
-export([init/1, terminate/2, code_change/3]).
-export([handle_call/3, handle_cast/2, handle_info/2]).
-export([add/2, add/3]).
-export([delete/1, delete/2]).
-export([to_list/0, to_list/1]).

% start without linking new process
start() -> 
  start([]).
start(Args) -> 
  gen_server:start({local, ?MODULE}, ?MODULE, Args, []).

% start with link on new process
start_link() -> 
  start_link([]).
start_link(Args) -> 
  gen_server:start_link({local, ?MODULE}, ?MODULE, Args, []).

% return by default a dict as state
init(_Args) -> {ok, dict:new()}.
% code_change/3 must return {ok, NewState}
code_change(_OldVsn, State, _Extra) -> {ok, State}.
terminate(_Reason, _State) -> ok.

% call request, used if we want to extract data from
% current state
handle_call(size, _From, State) -> 
  {reply, dict:size(State), State};
handle_call(list, _From, State) -> 
  {reply, dict:to_list(State), State};
handle_call({find, Key}, _From, State) -> 
  {reply, dict:find(Key, State), State};
handle_call({map, Fun}, _From, State) -> 
  {reply, dict:map(Fun, State), State};
handle_call({fold, Fun, Acc}, _From, State) -> 
  {reply, dict:fold(Fun, Acc, State), State};
handle_call(_Request, _From, State) -> 
  {reply, bad_call, State}.

% cast requests, used when the caller does not need a reply
% note: dict:append/3 keeps a list of values per key
handle_cast({add, Key, Value}, State) -> 
  {noreply, dict:append(Key, Value, State)};
handle_cast({update, Key, Fun}, State) -> 
  {noreply, dict:update(Key, Fun, State)};
handle_cast({delete, Key}, State) -> 
  {noreply, dict:erase(Key, State)};
handle_cast({merge, Fun, Dict}, State) -> 
  {noreply, dict:merge(Fun, Dict, State)};
handle_cast(_Request, State) -> 
  {noreply, State}.

% info request, currently do nothing
handle_info(_Request, State) -> 
  {noreply, State}.

% API
% add a new item based on key/value
-spec add(term(), term()) -> ok.
add(Key, Value) -> 
  add(?MODULE, Key, Value).
-spec add(pid()|atom(), term(), term()) -> ok.
add(Server, Key, Value) -> 
  gen_server:cast(Server, {add, Key, Value}).

% delete a key
-spec delete(term()) -> ok.
delete(Key) -> 
  delete(?MODULE, Key).
-spec delete(pid()|atom(), term()) -> ok.
delete(Server, Key) -> 
  gen_server:cast(Server, {delete, Key}).

% extract state as list
-spec to_list() -> list().
to_list() -> 
  to_list(?MODULE).
-spec to_list(pid()|atom()) -> list().
to_list(Server) -> 
  gen_server:call(Server, list).

You can use this module from the Erlang shell like this:

% compile our code
c(gen_dict).
% start your process
{ok, Process} = gen_dict:start().
% add a new value
gen_dict:add(key, value).
% add another value
gen_dict:add(key2, value2).
% extract as list
List = gen_dict:to_list().

Once you are a bit familiar with Erlang concepts and behaviours, you can do a lot of interesting things, like allowing only some processes to share a specific structure, or converting one structure into another with a few well-crafted processes.

Upvotes: 2

Roman Rabinovich

Reputation: 918

Like Tianpo said, you can use a process to store data in its state, but from what I've read, ETS tables are much faster.

If you need transaction-style locking, consider routing all access to the ETS tables through a single process, or use Mnesia or process state instead, as ETS doesn't offer much in that area.
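For the simple "sum the numbers coming with requests" case from the question, a single-key update may not need any locking at all, since ets:update_counter/4 is atomic for one key. Here is a minimal sketch of that idea. The module name ets_sum and the key total are made up for illustration, and it assumes integer inputs, because update_counter only works with integers:

```erlang
-module(ets_sum).
-export([init/0, add/1, total/0]).

%% Create a named, public table so any process can update it.
%% The table belongs to the process that calls init/0 and is
%% deleted when that process exits.
init() ->
    ets:new(?MODULE, [named_table, public, set]),
    ok.

%% Atomically add N to the running total and return the new total.
%% The default object {total, 0} is used the first time the key is seen.
add(N) when is_integer(N) ->
    ets:update_counter(?MODULE, total, N, {total, 0}).

%% Read the current total, or 0 if nothing has been added yet.
total() ->
    case ets:lookup(?MODULE, total) of
        [{total, Sum}] -> Sum;
        [] -> 0
    end.
```

In a real application you would probably call ets_sum:init() from a supervised, long-lived process rather than from the shell or a request handler, so the table survives for as long as the application runs.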

Upvotes: 0

Tianpo Gao

Reputation: 71

You can use a named gen_server to hold a state that contains the shared data. Then you can use gen_server:call to modify the state inside the gen_server.
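A minimal sketch of that idea, applied to the "sum numbers coming with requests" example from the question. The module name sum_server and the add/1 and total/0 functions are made up for illustration; an HTTP request handler would just call sum_server:add(N):

```erlang
-module(sum_server).
-behaviour(gen_server).

%% Public API (hypothetical names, chosen for this example)
-export([start_link/0, add/1, total/0]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the server, registered locally under the module name,
%% with 0 as the initial sum.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

%% Add N to the running sum and return the new sum.
add(N) when is_number(N) ->
    gen_server:call(?MODULE, {add, N}).

%% Return the current sum without changing it.
total() ->
    gen_server:call(?MODULE, total).

%% The state is just the running sum.
init(Initial) ->
    {ok, Initial}.

%% Calls are handled one at a time, so concurrent requests
%% cannot interleave their updates.
handle_call({add, N}, _From, Sum) ->
    NewSum = Sum + N,
    {reply, NewSum, NewSum};
handle_call(total, _From, Sum) ->
    {reply, Sum, Sum}.

%% Casts are not used here; ignore them.
handle_cast(_Msg, Sum) ->
    {noreply, Sum}.
```

The state is still immutable in the Erlang sense: each callback returns a new value that the gen_server loop passes on to the next call, just like the tail-recursive runForever/1 in the question.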

Upvotes: 2
