Reputation: 17650
I have several similar map/reduce jobs with divergent counter names and different getCounter()
conventions.
Is there an effective, idiomatic Hadoop way to manage the uniform use of counter logging in large map/reduce applications?
I think it is a somewhat scary idea that different map/reduce jobs can make their own counters. Is there a way you can disable this, so that all counters have to be created from a single resource? I think this would improve the quality of output for some of my classes.
Any other techniques for analyzing or managing all counters in an application would be appreciated.
Upvotes: 0
Views: 1527
Reputation: 33543
The following code is in Counters.java in the 0.20.203, 0.20.204 and 0.20.205 (now called 1.0) releases. Note that only the total counter limit is configurable (through mapreduce.job.counters.limit); the name-length limits and the group limit are hard-coded constants.
/** limit on the size of the name of the group **/
private static final int GROUP_NAME_LIMIT = 128;
/** limit on the size of the counter name **/
private static final int COUNTER_NAME_LIMIT = 64;
private static final JobConf conf = new JobConf();
/** limit on counters **/
public static int MAX_COUNTER_LIMIT =
    conf.getInt("mapreduce.job.counters.limit", 120);
/** the max groups allowed **/
static final int MAX_GROUP_LIMIT = 50;
In trunk and the 0.23 release, the corresponding code is in MRJobConfig.java. Note that all four limits are configurable there.
public static final String COUNTERS_MAX_KEY = "mapreduce.job.counters.max";
public static final int COUNTERS_MAX_DEFAULT = 120;
public static final String COUNTER_GROUP_NAME_MAX_KEY = "mapreduce.job.counters.group.name.max";
public static final int COUNTER_GROUP_NAME_MAX_DEFAULT = 128;
public static final String COUNTER_NAME_MAX_KEY = "mapreduce.job.counters.counter.name.max";
public static final int COUNTER_NAME_MAX_DEFAULT = 64;
public static final String COUNTER_GROUPS_MAX_KEY = "mapreduce.job.counters.groups.max";
public static final int COUNTER_GROUPS_MAX_DEFAULT = 50;
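As far as I can tell from the code above, nothing stops a job from creating its own counters, so a uniform scheme has to be a convention. One possible convention, sketched below in plain Java with no Hadoop dependency, is a single shared enum that every job takes its counter names from, plus a check of that catalogue against the default limits quoted above. The names AppCounter, CounterCatalogCheck, the group field and the sample counters are all hypothetical; only the four numeric defaults come from the code above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical shared catalogue: jobs use these constants instead of ad hoc strings. */
enum AppCounter {
    RECORDS_READ("ingest"),
    RECORDS_SKIPPED("ingest"),
    PARSE_ERRORS("quality");

    // Group label is a convention of this sketch, not something Hadoop requires.
    private final String group;

    AppCounter(String group) {
        this.group = group;
    }

    String group() {
        return group;
    }
}

/** Checks a set of counters against the default limits listed in MRJobConfig.java. */
final class CounterCatalogCheck {
    static final int COUNTERS_MAX = 120;      // mapreduce.job.counters.max
    static final int GROUP_NAME_MAX = 128;    // mapreduce.job.counters.group.name.max
    static final int COUNTER_NAME_MAX = 64;   // mapreduce.job.counters.counter.name.max
    static final int GROUPS_MAX = 50;         // mapreduce.job.counters.groups.max

    private CounterCatalogCheck() {}

    /** Groups the enum constants by their group label, preserving declaration order. */
    static Map<String, Set<String>> catalogue() {
        Map<String, Set<String>> byGroup = new LinkedHashMap<>();
        for (AppCounter c : AppCounter.values()) {
            byGroup.computeIfAbsent(c.group(), g -> new LinkedHashSet<>()).add(c.name());
        }
        return byGroup;
    }

    /** Returns one message per limit that the given counters exceed; empty if all is well. */
    static List<String> violations(Map<String, Set<String>> byGroup) {
        List<String> problems = new ArrayList<>();
        if (byGroup.size() > GROUPS_MAX) {
            problems.add("too many groups: " + byGroup.size());
        }
        int total = 0;
        for (Map.Entry<String, Set<String>> e : byGroup.entrySet()) {
            if (e.getKey().length() > GROUP_NAME_MAX) {
                problems.add("group name too long: " + e.getKey());
            }
            for (String name : e.getValue()) {
                total++;
                if (name.length() > COUNTER_NAME_MAX) {
                    problems.add("counter name too long: " + name);
                }
            }
        }
        if (total > COUNTERS_MAX) {
            problems.add("too many counters: " + total);
        }
        return problems;
    }
}
```

In a real job the constants would presumably be passed to the string-based getCounter(group, name) overload, for example getCounter(c.group(), c.name()), and the check could run in a unit test so that a new counter that breaks a limit is caught before the job is submitted. This only encourages uniform use; it does not prevent a job from calling getCounter() with its own names.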
You might be interested in MAPREDUCE-3520 and this mail. I am planning to work on MAPREDUCE-3520, but have not found the time yet :)
Upvotes: 4