poussma

Reputation: 7301

How to split a command line like string?

Basically, I need to split the string like

"one quoted argument" those are separate arguments "but not \"this one\""

and get the list of arguments as a result.

This regex "(\"|[^"])*"|[^ ]+ nearly does the job but the issue is that regular expression always (at least in java) tries to match the longest string possible.

As a consequence, when I apply the regex to a string that starts and ends with a quoted argument, it matches the whole string and does not produce a separate match for each argument.

Is there a way to tweak this regex or the matcher or the pattern or whatever to handle that?

Note: don't tell me I could use GetOpt or CommandLine.parse or anything else similar.
My concern is about pure java regex (if possible but I doubt it...).

Upvotes: 3

Views: 2929

Answers (4)

Markus Duft

Reputation: 211

I came up with this one (thanks Alex for giving me the good starting point :))

/**
 * Pattern that is capable of dealing with complex command line quoting and
 * escaping. This can recognize correctly:
 * <ul>
 * <li>"double quoted strings"
 * <li>'single quoted strings'
 * <li>"escaped \"quotes within\" quoted string"
 * <li>C:\paths\like\this or "C:\path like\this"
 * <li>--arguments=like_this or "--args=like this" or '--args=like this' or
 * --args="like this" or --args='like this'
 * <li>quoted\ whitespaces\\t (spaces & tabs)
 * <li>and probably more :)
 * </ul>
 */
private static final Pattern cliCracker = Pattern
    .compile(
       "[^\\s]*\"(\\\\+\"|[^\"])*?\"|[^\\s]*'(\\\\+'|[^'])*?'|(\\\\\\s|[^\\s])+",
       Pattern.MULTILINE);
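
A minimal usage sketch for the pattern above, in case it helps to see it applied. The class name CliCrackerDemo, the tokenize helper and the sample input are made up for illustration; only the regex itself comes from this answer. Note that each token is returned as matched, so surrounding quotes and escape backslashes are still present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, not part of the original answer.
class CliCrackerDemo {
    // Same regex as in the answer above, copied unchanged.
    private static final Pattern cliCracker = Pattern
        .compile(
           "[^\\s]*\"(\\\\+\"|[^\"])*?\"|[^\\s]*'(\\\\+'|[^'])*?'|(\\\\\\s|[^\\s])+",
           Pattern.MULTILINE);

    // Collects every match of the pattern, in order of appearance.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = cliCracker.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Raw input: --args="like this" plain 'single quoted' a\ b
        String line = "--args=\"like this\" plain 'single quoted' a\\ b";
        for (String token : tokenize(line)) {
            System.out.println(token);
        }
    }
}
```

For this input the sketch yields four tokens: --args="like this", plain, 'single quoted' and a\ b.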

Upvotes: 2

Igor

Reputation: 21

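import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String[] parseCommand( String cmd )
{
    if( cmd == null || cmd.length() == 0 )
    {
        return new String[] {};
    }

    cmd = cmd.trim();
    // Regex: "(\\"|[^"])*?"|[^ ]+
    // Either a quoted run (\" allowed inside, matched lazily) or a run of non-space characters.
    // One literal backslash in the regex takes four backslashes in a Java string literal.
    String regExp = "\"(\\\\\"|[^\"])*?\"|[^ ]+";
    Pattern pattern = Pattern.compile( regExp );
    Matcher matcher = pattern.matcher( cmd );
    List< String > matches = new ArrayList< String >();
    // Each successive find() yields the next argument, quotes included.
    while( matcher.find() ) {
        matches.add( matcher.group() );
    }
    return matches.toArray( new String[] {} );
}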

Upvotes: 2

Alex

Reputation: 25613

You can use the non-greedy quantifier *? to make it work:

"(\\"|[^"])*?"|[^ ]+

See this link for an example in action: http://gskinner.com/RegExr/?32srs
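
Here is a small sketch of how that regex might be used from Java on the string from the question. The class name LazySplit and the split helper are made up for illustration. One thing to watch: the \\" part of the regex becomes \\\\\" inside a Java string literal, because both the Java compiler and the regex engine consume a level of backslash escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LazySplit {
    // Regex as seen by the engine: "(\\"|[^"])*?"|[^ ]+
    private static final Pattern ARG =
        Pattern.compile("\"(\\\\\"|[^\"])*?\"|[^ ]+");

    // Returns one entry per argument; quotes and escapes are kept as matched.
    static List<String> split(String line) {
        List<String> args = new ArrayList<>();
        Matcher m = ARG.matcher(line);
        while (m.find()) {
            args.add(m.group());
        }
        return args;
    }

    public static void main(String[] args) {
        // Raw input: "one quoted argument" those are separate arguments "but not \"this one\""
        String line = "\"one quoted argument\" those are separate arguments "
                    + "\"but not \\\"this one\\\"\"";
        for (String arg : split(line)) {
            System.out.println(arg);
        }
    }
}
```

This prints six lines: the first quoted argument, the four bare words, and the last quoted argument with its escaped quotes intact.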

Upvotes: 4

eis

Reputation: 53462

regular expression always (at least in java) tries to match the longest string possible.

Um... no.

That is controlled by whether you use greedy or non-greedy quantifiers. Using a non-greedy one (by adding a question mark after the quantifier) should do it. It's called lazy quantification.

The default is greedy, but it certainly doesn't mean it is always that way.
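
To make the difference concrete, here is a tiny sketch comparing the two on the same input. The class name GreedyVsLazy, the firstMatch helper and the sample string are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GreedyVsLazy {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "\"a\" b \"c\"";   // raw text: "a" b "c"

        // Greedy .* runs to the last quote: prints "a" b "c"
        System.out.println(firstMatch("\".*\"", input));

        // Lazy .*? stops at the first closing quote: prints "a"
        System.out.println(firstMatch("\".*?\"", input));
    }
}
```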

Upvotes: 4
