Reputation: 130
I want to execute a MapReduce query in Erlang that contains two map phases, such that the Map2 function takes the result of the Map1 function as input. Is this possible, and if so, what must each map phase return?
I ran a test mapred query using two simple map functions, each of which returns the input object (in a list), but running the query gives a badmatch error:
Map1 = fun(O,_,_) -> [O] end.
Map2 = fun(O, _,_) -> [O] end.
C:mapred_bucket(<<"b7bc1418-198d-44a3-8835-8aa9cb416d5b">>, [{map, {qfun, Map1}, none, false}, {map, {qfun, Map2}, none, true}]).
{{badmatch,{r_object,<<"b7bc1418-198d-44a3-8835-8aa9cb416d5b">>,
<<255,230,193,167,254,7,246,64,154,190,36,236,32,232,189,
169,161,124,23,86>>,
[{r_content,{dict,2,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],...},
{{[],[],[],[],[],[],[],[],[],...}}},
<<"12d33872-4c92-4da5-9d16-5036a8059253">>}],
[{<<5,215,86,61>>,{1,63487018636}}],
{dict,1,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],...},
{{[],[],[],[],[],[],[],[],[],[],...}}},
undefined}},
[{riak_kv_map_phase,build_input,2},
{riak_kv_map_phase,'-handle_input/3-lc$^0/1-0-',2},
{riak_kv_map_phase,handle_input,3},
{luke_phase,executing,2},
{gen_fsm,handle_msg,7},
{proc_lib,init_p_do_apply,3}]}
I'm using riak_search-0.14.2
Erlang R14B03 (erts-5.8.4)
thank you!
Upvotes: 2
Views: 778
Reputation: 426
The first map function has to return a list of {Bucket, Key} or {{Bucket, Key}, KeyData} tuples, because its output becomes the input of the next map phase.
Like this:
Map1 = fun(O,_,_) -> [{riak_object:bucket(O), riak_object:key(O)}] end.
Map2 = fun(O, _,_) -> [O] end.
C:mapred_bucket(<<"b7bc1418-198d-44a3-8835-8aa9cb416d5b">>, [{map, {qfun, Map1}, none, false}, {map, {qfun, Map2}, none, true}]).
Upvotes: 4
Reputation: 8202
I'm not sure what the signature of the map function is in Erlang, as I've only done map/reduce in JavaScript, but I'll try to help.
To chain map phases, only the last map function should return a list of the actual data. Every map function before it needs to return a list of bucket/key pairs identifying the values passed in, since those pairs become the inputs to the next phase.
In JavaScript, I've accomplished this like so:
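function map_function(value, keydata, arg) {
    // Filtering goes here; `data` is assumed to be the object that step builds from the value.
    // If the filter rejects the value, return an empty list instead: return [];
    if (arg.last) {
        // Last map phase: return the actual data.
        data["key"] = value.key;
        return [data];
    } else {
        // Intermediate map phase: return [bucket, key] so the next phase can fetch the object.
        return [[value.bucket, value.key]];
    }
}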
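To make the chaining rule concrete, here is a small self-contained simulation in plain JavaScript. It is only a sketch of the idea and not Riak's implementation: `store`, `fetchObject`, and `runMapPhases` are hypothetical names invented for illustration, and the data is made up. It shows why an intermediate phase that returns whole objects fails (loosely mirroring the badmatch in the question), while one that returns bucket/key pairs chains cleanly.

```javascript
// Toy in-memory stand-in for a bucket (hypothetical data, not a real Riak client).
const store = {
  "orders/o1": { bucket: "orders", key: "o1", total: 40 },
  "orders/o2": { bucket: "orders", key: "o2", total: 5 },
};

// Hypothetical helper: resolve a [bucket, key] pair to the stored object,
// standing in for the lookup done before a map function is called.
function fetchObject([bucket, key]) {
  return store[bucket + "/" + key];
}

// Run the map phases in order. Every phase takes [bucket, key] inputs,
// so every phase except the last must emit [bucket, key] pairs.
function runMapPhases(inputs, phases) {
  let current = inputs;
  for (const phase of phases) {
    const results = [];
    for (const input of current) {
      if (!Array.isArray(input) || input.length !== 2) {
        throw new Error("map phase input must be a [bucket, key] pair");
      }
      results.push(...phase(fetchObject(input)));
    }
    current = results;
  }
  return current;
}

// Intermediate phase: filter, then pass on [bucket, key] for matching values.
const map1 = (value) => (value.total > 10 ? [[value.bucket, value.key]] : []);

// Final phase: return the data itself.
const map2 = (value) => [{ key: value.key, total: value.total }];

const chained = runMapPhases(
  [["orders", "o1"], ["orders", "o2"]],
  [map1, map2]
);
```

Replacing `map1` with a function that returns `[value]` makes the second phase throw in this sketch, because its input is then an object rather than a bucket/key pair.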
Hope this helps.
Upvotes: 5