Mark

Reputation: 69920

Riak - Concurrent Erlang Map/Reduce jobs

I'm running Erlang Map/Reduce jobs on Riak.

When I used JavaScript M/R jobs in the past, I had to tune the JS VM settings properly. At the time I found this conversation extremely useful: http://riak-users.197444.n3.nabble.com/Follow-up-Riak-Map-Reduce-error-preflist-exhausted-td4024330.html

Now, because I'm not an Erlang developer, I wonder what the main implications are of running concurrent M/R jobs on Riak, and whether there are any VM settings to tune (as I used to do with JS M/R).

Thanks

Upvotes: 0

Views: 636

Answers (2)

danechkin

Reputation: 1306

So far we have found these Riak mapred gotchas:

  • worker_limit_reached. This happens when a lot of data arrives at a mapred job and the job's queue is full
  • read with r=1. All data inside mapreduce is read with r=1
  • no read repair. Mapreduce reads do not trigger read repair
  • you may get already-deleted data. Inside mapred, check the special header of the object that indicates the object has already been deleted
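
For the last point, here is a minimal sketch of an Erlang map function that skips deleted objects. It assumes the tombstone marker is the `<<"X-Riak-Deleted">>` metadata key and that object metadata is a `dict` (as in the 1.x series); the module and function names are made up for illustration, so verify the details against your Riak version.

```erlang
-module(mr_skip_deleted).
-export([map_live_values/3, is_tombstone/1]).

%% Metadata key assumed to mark a tombstone (deleted object).
-define(MD_DELETED, <<"X-Riak-Deleted">>).

%% True when the metadata dict carries the tombstone marker.
is_tombstone(MetaData) ->
    dict:is_key(?MD_DELETED, MetaData).

%% Map phase function; Riak calls it as Fun(Object, KeyData, Arg).
%% Inputs that could not be found arrive as {error, notfound}.
map_live_values({error, notfound}, _KeyData, _Arg) ->
    [];
map_live_values(Object, _KeyData, _Arg) ->
    %% get_contents/1 returns one {MetaData, Value} pair per sibling.
    Contents = riak_object:get_contents(Object),
    [Value || {MetaData, Value} <- Contents, not is_tombstone(MetaData)].
```

The compiled module has to be on the code path of every node; it can then be used as a phase such as `{map, {modfun, mr_skip_deleted, map_live_values}, none, true}`.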

P.S. This applies to Riak 1.2.1. The Basho folks resolve many issues quickly, so this may change in the near future.

Upvotes: 1

user425720

Reputation: 3598

Basically, what happens here is that all phases of the map/reduce query are performed by the Erlang VM, not by Erlang plus JS. Since the jobs are isolated in separate processes inside the Erlang VM, operations are not affected. Host-wise you have the same computational power, so that is also OK. Regarding Erlang VM parameters, many of them have already been tweaked to improve Riak operations, so your query is good to go.

Upvotes: 0
