Reputation: 8616
I want to implement a real-time score ranking(ordered). I hope I can get every player's score fast(key-value).
Here player_id is key and score is value.
I try to use ordered-set type ETS to store the list of all the player's score, But ordered set orders after the key not the value.
Does Erlang/OTP have some other data struct that can solve my problem?
Upvotes: 6
Views: 2419
Reputation: 83
There are three solutions:
ets ordered-set
RAM-only mnesia table with secondary index
nif
1) ordered-set, records in the ets table should be {score, player_id}, not {player_id, score}, so that they're sorted by score. To get the score for a player, just use match. Although match needs to scan the whole table, it's still pretty fast.
Profiling: assuming there are 10k players, insert 10k records into ets table, then ets:match_object(Table, {'_', PlayerID}). Each match takes 0.7 to 1.1ms, which is more than good enough in most cases. (CPU: i5 750)
2) mnesia table, make it RAM-only and use an secondary index for player_id:
mnesia:create_table(user, [{type, ordered_set}, {attributes, record_info(fields, user)}, {index, [playerid]}])
Average get time using mnesia:read in mnesia:transaction is 0.09ms. But, inserting 10k records is about 180 times slower than its ets counterpart (2820ms vs 15ms).
If neither ets nor mnesia satisfies your performance requirement, resorting to C with nif could be another choice. But personally I think over-optimizing is not worthy here unless it's really your bottleneck.
Upvotes: 3
Reputation: 14042
What I understand is that you need to maintain a list of pair (Key,Score) with which you want to perform:
I propose you to store your data into 2 different ets:
When you need to display scores, the ordered set is sufficient.
when you need to update a score you should use the ets to get the value of the previous score for Key, delete the record {PrevScore,Key} and insert {NewScore,Key} in the ordered set and simply update the first ets with new value.
I tested this solution on a 1 000 000 item list, the update of 1 score takes an average of 3µs on my laptop (windows XP, core i5, 2Go ram ,all disks full and many app running in background). the code I used :
note I used private tables and a single server to access them in order to guarantee the consistency of the 2 tables, so concurrent processes can access to the server (named score) without conflict, the requests will be executed in the order they arrive to the server. It could be possible to answer in priority to any get(N) request with 2 receive blocks.
here is the test result on my home PC (ubuntu 12.04, 8gb ddr, AMD phenom II X6)...
[edit] I modified the update/2 function in order to be synchronous, so the measure is now significant, and the result more understandable. It appears that for table smaller than 10000 records, the overhead of ets management and message passing is preponderant.
-module (score).
-export ([start/0]).
-export ([update/2,delete/1,get/1,stop/0]).
score ! {update,M,S,self()},
receive
ok -> ok
end.
delete(M) ->
score ! {delete,M}.
get(N) ->
score ! {getbest,N,self()},
receive
{ok,L} -> L
after 5000 ->
timeout
end.
stop() ->
score ! stop.
start() ->
P = spawn(fun() -> initscore() end),
register(score,P).
initscore() ->
ets:new(score,[ordered_set,private,named_table]),
ets:new(member,[set,private,named_table]),
loop().
loop() ->
receive
{getbest,N,Pid} when is_integer(N), N > 0 ->
Pid ! {ok,lists:reverse(getbest(N))},
loop();
{update,M,S,P} ->
update_member(M,S),
P ! ok,
loop();
{delete,M} ->
delete_member(M),
loop();
stop ->
ok
end.
update_member(M,S) ->
case ets:lookup(member,M) of
[] ->
ok;
[{M,S1}] ->
ets:delete(score,{S1,M})
end,
ets:insert(score,{{S,M}}),
ets:insert(member,{M,S}).
delete_member(M) ->
case ets:lookup(member,M) of
[] ->
ok;
[{M,S}] ->
ets:delete(score,{S,M}),
ets:delete(member,M)
end.
getbest(N) ->
K= ets:last(score),
getbest(N-1,K,[]).
getbest(_N,'$end_of_table',L) -> L;
getbest(0,{S,M},L) -> [{M,S}|L];
getbest(N,K={S,M},L) ->
K1 = ets:prev(score,K),
getbest(N-1,K1,[{M,S}|L]).
and the code for test:
-module (test_score).
-compile([export_all]).
init(N) ->
score:start(),
random:seed(erlang:now()),
init(N,10*N).
stop() ->
score:stop().
init(0,_) -> ok;
init(N,M) ->
score:update(N,random:uniform(M)),
init(N-1,M).
test_update(N,M) ->
test_update(N,M,0).
test_update(0,_,T) -> T;
test_update(N,M,T) -> test_update(N-1,M,T+update(random:uniform(M),random:uniform(10*M))).
update(K,V) ->
{R,_} = timer:tc(score,update,[K,V]),
R.
Upvotes: 6
Reputation: 3658
I wouldn't be picky about the way erlang chooses to order it's data: it optimizes itself or fast lookups. What you could do however, is read your ETS table in a list, and use a lists:sort/2
to sort the data.
List = ets:tab2list(EtsTable),
lists:sort(SortFun,List).
It'll still be fast: ETS tables and lists reside in memory, as long as you have enough of it. However, I'd dump the ordered_set, you'll lose the constant access time
This module is an interface to the Erlang built-in term storage BIFs. These provide the ability to store very large quantities of data in an Erlang runtime system, and to have constant access time to the data. (In the case of ordered_set, see below, access time is proportional to the logarithm of the number of objects stored).
And don't forget some form of disk-based backup, dets or mnesia (if the data is worth keeping).
Upvotes: 3