Ionică Bizău

Reputation: 113445

Convert a string into shell arguments

I'm interested in how bash input is parsed into arguments.

For example, process.argv gives us an array of strings in Node.js (but the question is language-agnostic).

My question is: how can I parse an input like "node foo.js --foo "bar baz" -b foo" into an array like the one process.argv (or its equivalent in other languages) returns, e.g. ["node", "foo.js", "--foo", "\"bar baz\"", "-b", "foo"]?

Splitting on spaces is not enough (because of the quotes). Is it possible to handle the quotes with a more complicated regex and get such an array?

Upvotes: 7

Views: 2552

Answers (2)

Unihedron

Reputation: 11051

Since a regex solution seems to be explicitly requested, even though this is really a task for a proper parser, here's a regex one-liner for the thrills.

Considering the specifications:

  • JS-compatible
  • Tokenize by spaces but keep "..." or '...' together

A simple match call can be used to find the values, with the downside that escaped or nested quotes will not be detected well (recursive matching is difficult with regexes).

>>> str = "node foo.js --foo \"bar baz\" -b foo";
    str.match(/"[^"]+"|'[^']+'|\S+/g)
<<< ["node", "foo.js", "--foo", "\"bar baz\"", "-b", "foo"]

(Simplified) Regex explanation:

  • "[^"]+"|'[^']+' is a subpattern that looks for pairs of quotes with anything other than the quotes themselves in between.
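  • | separates the alternatives. They are tried left to right, so the quoted forms take precedence.
  • \S is the negation of \s: it matches a single non-whitespace character. The + quantifier repeats it one or more times, so \S+ matches a run of non-whitespace characters, i.e. any token not already consumed by the quoted alternatives.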
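
The one-liner keeps the quote characters in each token, whereas a shell would remove them before the program sees its arguments. As a rough sketch of how the same pattern could drop them, capture groups can be added around the quoted content. The function name tokenize is hypothetical, and the quantifiers are changed from + to * here (an assumption, so that an empty "" yields an empty token).

```javascript
// Hypothetical helper: the same three alternatives as the one-liner, but with
// capture groups so the surrounding quote characters can be dropped.
function tokenize(str) {
  const pattern = /"([^"]*)"|'([^']*)'|(\S+)/g;
  const tokens = [];
  for (const m of str.matchAll(pattern)) {
    // Only one group participates in each match; the others are undefined.
    // ?? (rather than ||) keeps an empty string produced by "" or ''.
    tokens.push(m[1] ?? m[2] ?? m[3]);
  }
  return tokens;
}

console.log(tokenize('node foo.js --foo "bar baz" -b foo'));
// [ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
```

This still shares the limitations mentioned above: escaped quotes are not understood, and a quote that starts in the middle of a word (as in --foo="bar baz") is not recognised, because \S+ consumes the word up to the first space.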

Upvotes: 5

James Thomas

Reputation: 4339

The shell-quote npm package handles this.

var parse = require('shell-quote').parse;
parse('node foo.js --foo "bar baz" -b foo');

[ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
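
For illustration only, the kind of work such a parser does can be sketched without any dependency as a small character-by-character state machine. This is not how shell-quote is implemented; splitArgs is a hypothetical name, and the escaping rules are deliberately simplified (a backslash escapes any following character, both outside quotes and inside double quotes, and is literal inside single quotes). Variables, globs and operators are not handled.

```javascript
// Hypothetical helper: splits a command line into arguments, handling
// single quotes, double quotes and backslash escapes in a simplified way.
function splitArgs(input) {
  const args = [];
  let current = "";
  let quote = null;     // active quote character, or null when outside quotes
  let hasToken = false; // true once a token has started (so "" yields an empty argument)

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (quote) {
      if (ch === quote) {
        quote = null; // closing quote: leave quoted mode, token may continue
      } else if (ch === "\\" && quote === '"' && i + 1 < input.length) {
        current += input[++i]; // simplified: backslash escapes the next character
      } else {
        current += ch;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      hasToken = true;
    } else if (ch === "\\" && i + 1 < input.length) {
      current += input[++i];
      hasToken = true;
    } else if (/\s/.test(ch)) {
      if (hasToken) {
        args.push(current); // whitespace outside quotes ends the token
        current = "";
        hasToken = false;
      }
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (quote) throw new Error("Unterminated quote");
  if (hasToken) args.push(current);
  return args;
}

console.log(splitArgs('node foo.js --foo "bar baz" -b foo'));
// [ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
```

Unlike a single regex match, this approach also joins adjacent quoted and unquoted pieces into one argument, so --name="a b" comes out as the single token --name=a b.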

Upvotes: 6
