Reputation: 11
A MapReduce job supposedly can't be debugged remotely in a distributed cluster because each map and reduce task spawns its own JVM. What exactly does that mean? Can't we attach a debugger to every process on every node in the cluster involved in the MapReduce job?
I've read many articles and proposed solutions, but I still don't understand the problem behind debugging a MapReduce job in a distributed cluster. Any help would be appreciated.
Thanks
Upvotes: 1
Views: 20
Reputation: 191738
You can debug only a single task at any given time; no debugger that I know of can create multiple sessions at once. Specifically, each MapReduce task can't be individually configured with its own JVM debug port, so even if it were possible, you would have to know which NodeManager the tasks get started on and ensure there's no port overlap on the same host.
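To make the port problem concrete, here is a minimal sketch, not a working debug setup. It assumes the debug flag is passed through the job-wide `mapreduce.map.java.opts` property using the standard JDWP agent option. The class name, method names, port 5005, and host names are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TaskDebugOpts {

    // Builds the standard JDWP agent flag for a JVM that listens on the given port.
    public static String jdwpOption(int port) {
        return "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=" + port;
    }

    // Given the host each task landed on (hypothetical input), returns the hosts
    // running more than one task. Since every task gets the same fixed port,
    // those hosts would see a port clash.
    public static Set<String> hostsWithPortClash(List<String> taskHosts) {
        Set<String> seen = new HashSet<>();
        Set<String> clashes = new TreeSet<>();
        for (String host : taskHosts) {
            if (!seen.add(host)) {
                clashes.add(host);
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        // The property is set once per job, so all map task JVMs receive this same string.
        String mapJavaOpts = jdwpOption(5005);
        System.out.println("mapreduce.map.java.opts=" + mapJavaOpts);

        // Hypothetical placement: two of the three tasks end up on node1.
        List<String> taskHosts = Arrays.asList("node1", "node2", "node1");
        System.out.println("Hosts with a debug-port clash: " + hostsWithPortClash(taskHosts));
    }
}
```

Because the option string is shared by every task of that type, any host that runs two such tasks at once would have both JVMs asking for the same port, and you would not know in advance which hosts those are.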
If you really need to remote debug, it seems like you have poor unit test coverage to begin with, and you probably shouldn't deploy that code to production anyway.
Upvotes: 1