LTB

Reputation: 31

Using picoCLI to recursively parse configuration files?

I'm answering this question below; I opened it because it is more general than what I originally asked in an earlier question, so it wouldn't really fit there. It took me quite some tinkering, so I thought I'd share the solution here.

My situation:

I use picoCLI to parse multiple configuration files that in turn can "include" other config files, to arbitrary depth. Unfortunately, for some of my options the order in which they are parsed also matters.

In my application, there are "section" options like section=A:teacher that request section A and cause it to be processed (I'll leave out what exactly that means) for teachers, students or other groups. Among a number of other options, there is also one called configfile= that "includes" another option file. That situation can be described by a "tree" of configuration details:

# options given on actual command line ("root of tree")
    section=A:teacher
    configfile=cf-1.txt  # include options from cf-1.txt
        section=A:student # this indentation: options read from cf-1.txt
        section=B:principal
        configfile=cf-2.txt  # read options from cf-2.txt
            section=A:parent # this indentation: options read from cf-2.txt
            section=C:parent
        section=C:teacher  # back in cf-1.txt
    section=D:admin  # back to actual command line

I want this tree to be traversed depth-first, with "later" options overwriting "earlier" ones if they refer to the same section name: In the end, section A should get parent and C should get teacher.
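To make the intended semantics concrete, here is a minimal sketch in plain Java (no picoCLI involved). The class name SectionOverwriteSketch and the method resolve are made up for illustration; the sketch assumes each value has the form name:group and simply lets a later value replace an earlier one for the same section name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of picoCLI or of my application.
final class SectionOverwriteSketch {

    /** Applies "name:group" values in the given order; later values overwrite earlier ones. */
    static Map<String, String> resolve(List<String> valuesInTraversalOrder) {
        Map<String, String> groupBySection = new LinkedHashMap<>();
        for (String value : valuesInTraversalOrder) {
            String[] parts = value.split(":", 2); // assumes exactly one ':' separator
            groupBySection.put(parts[0], parts[1]);
        }
        return groupBySection;
    }

    public static void main(String[] args) {
        // The example tree, flattened depth-first:
        List<String> depthFirst = List.of(
            "A:teacher", "A:student", "B:principal",
            "A:parent", "C:parent", "C:teacher", "D:admin");
        System.out.println(resolve(depthFirst));
        // prints {A=parent, B=principal, C=teacher, D=admin}
    }
}
```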

For parsing configfile= options, I can't use picoCLI's @-syntax because these files are not necessarily in the "current" folder, so I want to control where the application looks for them. That's problem #1. It is solved by the parseConfigfile method listed below.

Unfortunately, picoCLI has a peculiar quirk when an option occurs multiple times in the same file (as section does with A, B and C): it calls the annotated setter method each time, but with the option values accumulating in the list parameter of that method. The first call gets only (A:student), the second (A:student,B:principal), the third (A:student,B:principal,C:teacher), etc.

I learned here that this behaviour is intended but for me it is problem #2 because the repeated evaluation of section=A:student messes up my later-options-overwrite-earlier-ones semantics: In the end, A is incorrectly configured for teacher. For many options (those with "one-dimensional" values), that's not a problem, but it is for section= and, somewhat ironically, also for configfile=.
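The effect can be reproduced without picoCLI. The following is a hand-written simulation, not picoCLI output: the call sequence in replay() is my reading of the behaviour described above for the example tree, and the class AccumulatingSetterSimulation is hypothetical. The naive setter treats every entry of the accumulated list as new.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the annotated class; no picoCLI involved.
final class AccumulatingSetterSimulation {
    final Map<String, String> groupBySection = new HashMap<>();

    /** Naive setter: re-applies every value in the accumulated list. */
    void setSections(List<String> accumulated) {
        for (String value : accumulated) {
            String[] parts = value.split(":", 2);
            groupBySection.put(parts[0], parts[1]);
        }
    }

    /** Replays the setter calls assumed for the example tree, in parse order. */
    static Map<String, String> replay() {
        AccumulatingSetterSimulation config = new AccumulatingSetterSimulation();
        config.setSections(List.of("A:teacher"));                               // command line
        config.setSections(List.of("A:student"));                               // cf-1.txt
        config.setSections(List.of("A:student", "B:principal"));                // cf-1.txt
        config.setSections(List.of("A:parent"));                                // cf-2.txt
        config.setSections(List.of("A:parent", "C:parent"));                    // cf-2.txt
        config.setSections(List.of("A:student", "B:principal", "C:teacher"));   // back in cf-1.txt
        config.setSections(List.of("A:teacher", "D:admin"));                    // back on command line
        return config.groupBySection;
    }

    public static void main(String[] args) {
        System.out.println(replay().get("A")); // prints teacher, although parent was wanted
    }
}
```

The last call re-applies A:teacher from the command line after cf-2.txt had already set A to parent, which is exactly the unwanted result.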

Upvotes: 0

Views: 311

Answers (1)

LTB

Reputation: 31

Here's my solution; maybe it's useful to someone. The basic approach is to use a new CommandLine instance for each nested config file.

All code snippets below are from the annotated class; project-specific housekeeping, error checking, path construction etc. were removed.

Problem #1 is solved by the parseConfigfile(...) method:

@Option(names = { "configfile", "cf" } )
public void parseConfigfile(final List<String> accumulatedCfgFiles) {
    // prune the list, keeping only "new" entries that haven't been seen yet:
    List<String> newCfgFiles = this.cfgfileHelper.retainNewOptionValuesOnly(
        accumulatedCfgFiles, "configfile");
    if(newCfgFiles.isEmpty()) {
        // another picoCLI quirk: this can happen even though the option is always given with a value
        return; 
    } else if(newCfgFiles.size() > 1) {
        // loop over the files if you want to allow this, or report an error
    }
    // some path tinkering left out
    File cfgFile = new File(newCfgFiles.get(0));
    if(this.stackOfConfigFiles.contains(cfgFile)) {
        // report error because of cyclic reference
    } else {
        this.stackOfConfigFiles.push(cfgFile);
        // task a new CommandLine instance with processing that file:
        CommandLine cmd = new CommandLine(this);
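        // Apache Commons IO; needs java.nio.charset.StandardCharsets (IOException handling left out)
        List<String> linesFromFile = FileUtils.readLines(cfgFile, StandardCharsets.UTF_8);
        String[] optionsFromFile = linesFromFile.toArray(new String[0]);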
        this.cfgfileHelper.wrapParseArgsCall(cmd, optionsFromFile);
        this.stackOfConfigFiles.pop();
    }
}

The method uses an instance of NestedCfgfileHelper (see source below) which does all the nested-config-specific housekeeping to solve problem #2. Your annotated class needs one helper instance as a public attribute. Call the constructor with the names of all "problem" options (those that the helper class should take care of):

...
public final NestedCfgfileHelper cfgfileHelper = 
    new NestedCfgfileHelper(new String[] { "configfile", "section" });
...

The following steps make all this work:

  • Identify those options that are sensitive to "spurious setter method calls" (most are not);
  • If there are any, paste NestedCfgfileHelper's source into your annotated class as an inner class;
  • Create an instance of NestedCfgfileHelper as a public member of your annotated class, telling the constructor the names of all those "problematic" options;
  • Never call yourInstanceOfCommandLine.parseArgs(...) directly. Instead, instantiate and initialize it, but pass it to the helper using instanceOfYourAnnotatedClass.cfgfileHelper.wrapParseArgsCall(...);
  • Let the setter method(s) for those "difficult" options...
    • ... first get rid of "old" values from previous invocations by calling retainNewOptionValuesOnly, passing the name of the option;
    • ... then process the remaining option value(s) normally.
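The recursion and the cycle check that parseConfigfile relies on can be tried in isolation. This is only a sketch under assumptions: IncludeStackSketch is a hypothetical class, the "files" live in an in-memory map instead of on disk, and a plain recursive call stands in for handing each file to a fresh CommandLine instance.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the map replaces real files, recursion replaces nested CommandLine instances.
final class IncludeStackSketch {
    private final Map<String, List<String>> files;               // file name -> option lines
    private final Deque<String> openFiles = new ArrayDeque<>();  // files currently being parsed
    final List<String> sectionsInOrder = new ArrayList<>();      // section values in parse order

    IncludeStackSketch(Map<String, List<String>> files) {
        this.files = files;
    }

    /** Parses options depth-first, descending into configfile= includes as they occur. */
    void parse(List<String> options) {
        for (String option : options) {
            if (option.startsWith("configfile=")) {
                include(option.substring("configfile=".length()));
            } else if (option.startsWith("section=")) {
                sectionsInOrder.add(option.substring("section=".length()));
            }
        }
    }

    private void include(String fileName) {
        if (openFiles.contains(fileName)) {
            throw new IllegalStateException("cyclic include: " + fileName);
        }
        List<String> lines = files.get(fileName);
        if (lines == null) {
            throw new IllegalArgumentException("unknown file: " + fileName);
        }
        openFiles.push(fileName);
        try {
            parse(lines);
        } finally {
            openFiles.pop(); // leave the stack clean even if parsing fails
        }
    }

    public static void main(String[] args) {
        IncludeStackSketch sketch = new IncludeStackSketch(Map.of(
            "cf-1.txt", List.of("section=A:student", "section=B:principal",
                                "configfile=cf-2.txt", "section=C:teacher"),
            "cf-2.txt", List.of("section=A:parent", "section=C:parent")));
        sketch.parse(List.of("section=A:teacher", "configfile=cf-1.txt", "section=D:admin"));
        System.out.println(sketch.sectionsInOrder);
    }
}
```

Popping in a finally block is a choice of this sketch; it keeps the stack consistent when a nested parse throws.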

Finally, here's the source of NestedCfgfileHelper:

/** NestedCfgfileHelper ensures that the values of certain options are 
 * processed just once, despite the picoCLI quirk.     */
public final class NestedCfgfileHelper {
    /** Maps an (option name|CommandLine instance) pair to the number of 
     * option values that instance has so far passed for that option. 
     * Because Java doesn't have Maps with two keys, it's implemented as 
     * a Map of Maps:         */
    private Map<String, Map<CommandLine, Integer>> mapOptionAndCLToCount =
            new HashMap<>();

    /** Constructs a helper instance and prepares it to handle the options 
     * given as parameters. 
     * 
     * @param optionNames any number of Strings, with each String denoting
     * one option whose values should be protected against being processed
     * multiple times */
    public NestedCfgfileHelper(String... optionNames) {
        // make one mapping for each option name given:
        for(String optionName: optionNames) {
            mapOptionAndCLToCount.put(optionName, new HashMap<CommandLine, Integer>());
        }
    }
        
    /** This stack keeps track of CommandLine instances that are currently 
     * working on this annotated class instance. A stack is needed because 
     * config files can be nested. */
    private Stack<CommandLine> stackOfCmdLineInstances = new Stack<>();

    /** Wraps the call to {@link CommandLine#parseArgs(String...)} with some
     * housekeeping so that when an annotated setter method is being called
     * during option parsing, the helper method can look up from which 
     * CommandLine instance the call is coming.
     * Because parseArgs invocations will be nested recursively for nested config
     * files, the respective CommandLine instances are kept on a stack.
     * @param cl CommandLine instance that is about to start parsing
     * @param args options that are to be parsed     */
    public void wrapParseArgsCall(final CommandLine cl, final String[] args) {
        // the brand new CommandLine instance hasn't passed any values yet,
        // so put 0 in all maps: 
        mapOptionAndCLToCount.forEach(
            (String s, Map<CommandLine, Integer> m) -> m.put(cl, 0));
        this.stackOfCmdLineInstances.push(cl);
        cl.parseArgs(args);
        this.stackOfCmdLineInstances.pop();
    }

    /** This method filters its list parameter, discarding the first n 
     * entries (assuming they've already been processed), where n is retrieved
     * from a Map instance kept for each option name.
     * As a side effect, the updated number of option values is stored. 
     * This method should be called exactly once per invocation of an annotated 
     * setter method, and only by setter methods whose options values shouldn't
     * be set multiple times.
     * 
     * @param accumulated List containing all values (old and new ones 
     * accumulated) of the option named in the other parameter.
     * @param optionName describes the option that's being parsed.
     * @return pruned list containing only the "new" values that haven't
     * been seen before.     */
    private List<String> retainNewOptionValuesOnly(
        final List<String> accumulated, 
        final String optionName) {

        // get the CommandLine instance currently working on this annotated class instance:
        CommandLine currentCL = this.stackOfCmdLineInstances.peek();

        // get the CommandLine->int map for the option name passed:
        Map<CommandLine, Integer> map = mapOptionAndCLToCount.get(optionName);
        if(map == null) {
            throw new IllegalArgumentException("unknown option: " + optionName);
        }
            
        /* Find out how many option values it has already passed to the setter. */
        int n = map.get(currentCL);
            
        /* discard the first n entries (they have already been processed) of
         * accumulated, keeping only the "new" ones: */ 
        List<String> optionValuesNewThisTime = 
            accumulated.subList(n, accumulated.size());
        // associate the new number of option values with the current CommandLine:
        int newNumber = n + optionValuesNewThisTime.size();
        map.put(currentCL, newNumber);
        return optionValuesNewThisTime;
    }
}
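The counting idea can also be checked without picoCLI. The sketch below is a simplified stand-in, not the helper itself: NewValuesOnlySketch is hypothetical, plain Object tokens take the place of CommandLine instances, and the token is passed explicitly instead of being looked up on a stack. The replayed calls are the accumulating setter invocations assumed for the example tree from the question.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Object tokens stand in for CommandLine instances.
final class NewValuesOnlySketch {
    private final Map<Object, Integer> seenPerParser = new IdentityHashMap<>();
    final Map<String, String> groupBySection = new HashMap<>();

    /** Applies only the values this parser has not passed before. */
    void setSections(Object parser, List<String> accumulated) {
        int alreadySeen = seenPerParser.getOrDefault(parser, 0);
        for (String value : accumulated.subList(alreadySeen, accumulated.size())) {
            String[] parts = value.split(":", 2);
            groupBySection.put(parts[0], parts[1]);
        }
        seenPerParser.put(parser, accumulated.size());
    }

    /** Replays the accumulating setter calls assumed for the example tree. */
    static Map<String, String> replay() {
        Object root = new Object();
        Object cf1 = new Object();
        Object cf2 = new Object();
        NewValuesOnlySketch config = new NewValuesOnlySketch();
        config.setSections(root, List.of("A:teacher"));
        config.setSections(cf1, List.of("A:student"));
        config.setSections(cf1, List.of("A:student", "B:principal"));
        config.setSections(cf2, List.of("A:parent"));
        config.setSections(cf2, List.of("A:parent", "C:parent"));
        config.setSections(cf1, List.of("A:student", "B:principal", "C:teacher"));
        config.setSections(root, List.of("A:teacher", "D:admin"));
        return config.groupBySection;
    }

    public static void main(String[] args) {
        Map<String, String> result = replay();
        System.out.println(result.get("A")); // prints parent
        System.out.println(result.get("C")); // prints teacher
    }
}
```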

Upvotes: 0
