Reputation: 113445
I'm interested in how bash input is parsed into arguments.
For example, process.argv gives us an array of strings in Node.js (but the question is language-agnostic).
My question is: how can I parse an input like "node foo.js --foo "bar baz" -b foo" into the array that process.argv (or its equivalent in other languages) returns, e.g. ["node", "foo.js", "--foo", "\"bar baz\"", "-b", "foo"]?
Splitting on spaces is not enough (because of the quotes). Is it possible to handle the quotes with a more complicated regex and get such an array?
Upvotes: 7
Views: 2552
Reputation: 11051
Since a regex solution is explicitly requested, here's a regex one-liner for the thrills, even though this is really a job for a proper parser.
Considering the specifications (whitespace separates arguments, while "..." or '...' keeps text together), a simple match function can be used to find values, with the downside that escaped or nested quotes will not be detected well (recursive matching is difficult with regexes).
>>> str = "node foo.js --foo \"bar baz\" -b foo";
str.match(/"[^"]+"|'[^']+'|\S+/g)
<<< ["node", "foo.js", "--foo", "\"bar baz\"", "-b", "foo"]
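The match above keeps the surrounding quotes on each token, whereas process.argv would contain bar baz without them. As a minimal sketch (splitArgs is a hypothetical helper name, not part of any library), the same regex can be wrapped in a function that strips one pair of matching outer quotes:

```javascript
// Hypothetical helper: tokenizes with the regex from this answer, then
// strips one pair of matching outer quotes from each token.
function splitArgs(input) {
  // match() returns null when nothing matches, so fall back to [].
  const tokens = input.match(/"[^"]+"|'[^']+'|\S+/g) || [];
  return tokens.map((token) => {
    const first = token[0];
    const last = token[token.length - 1];
    const quoted =
      token.length >= 2 && first === last && (first === '"' || first === "'");
    return quoted ? token.slice(1, -1) : token;
  });
}

console.log(splitArgs('node foo.js --foo "bar baz" -b foo'));
// [ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
```

Quotes that begin mid-token, as in --foo="bar baz", are still split on the space, which is one reason a real parser is the more robust choice.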
(Simplified) Regex explanation:
"[^"]+"|'[^']+' is a subpattern that looks for pairs of quotes with anything other than the quotes themselves in between.
| alternates to another option.
\S is the negation of \s: it matches a non-whitespace character, so \S+ picks up the tokens not already collected by the quoted alternatives. The + quantifies the preceding \S (one or more characters), and the g flag returns every match in the string.
Upvotes: 5
Reputation: 4339
The shell-quote npm package handles this:
var parse = require('shell-quote').parse;
parse('node foo.js --foo "bar baz" -b foo');
[ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
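For comparison, here is a rough hand-rolled sketch of the same idea without a dependency. It is not how shell-quote is implemented; tokenize is a hypothetical name, and the rules are deliberately simplified assumptions: whitespace separates arguments, single quotes are literal, and a backslash escapes the next character everywhere else (real bash is stricter inside double quotes and also handles variables, globs, and operators).

```javascript
// Hypothetical tokenizer: a simplified take on shell word splitting.
// Assumptions: whitespace separates arguments, single quotes are literal,
// and a backslash escapes the next character outside single quotes.
function tokenize(input) {
  const args = [];
  let current = '';
  let inToken = false; // true once an argument has started (allows "")
  let quote = null;    // the active quote character, or null

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (quote) {
      if (ch === quote) {
        quote = null;                  // closing quote
      } else if (ch === '\\' && quote === '"' && i + 1 < input.length) {
        current += input[++i];         // escaped character inside "..."
      } else {
        current += ch;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;                      // opening quote
      inToken = true;
    } else if (ch === '\\' && i + 1 < input.length) {
      current += input[++i];           // escaped character outside quotes
      inToken = true;
    } else if (/\s/.test(ch)) {
      if (inToken) args.push(current); // whitespace ends the argument
      current = '';
      inToken = false;
    } else {
      current += ch;
      inToken = true;
    }
  }
  if (quote) throw new Error('Unterminated quote: ' + quote);
  if (inToken) args.push(current);
  return args;
}

console.log(tokenize('node foo.js --foo "bar baz" -b foo'));
// [ 'node', 'foo.js', '--foo', 'bar baz', '-b', 'foo' ]
```

Unlike a plain regex match, this keeps --foo="bar baz" together as a single argument, --foo=bar baz, and reports an unterminated quote instead of silently splitting it.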
Upvotes: 6