Reputation: 440
str = "cmd -opt1 { a b c d e f g h } -opt2"
I want output like this:
[ 'cmd', '-opt1', '{ a b c d e f g h }', '-opt2' ]
Upvotes: 3
Views: 3012
Reputation: 89567
In this situation, don't try to split; use re.findall:
>>> import re
>>> re.findall(r'{[^}]*}|\S+', 'cmd -opt1 { a b c d e f g h } -opt2')
['cmd', '-opt1', '{ a b c d e f g h }', '-opt2']
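The pattern tries the brace group first and falls back to a run of non-whitespace characters. As a minimal sketch, the same call can be wrapped in a reusable helper; the names `TOKEN` and `tokenize` are hypothetical and not part of the answer above.

```python
import re

# Alternative 1: a flat {...} group (no nested braces).
# Alternative 2: any run of non-whitespace characters.
TOKEN = re.compile(r'{[^}]*}|\S+')

def tokenize(command):
    """Return the tokens of `command`, keeping each flat {...} group whole."""
    return TOKEN.findall(command)

print(tokenize("cmd -opt1 { a b c d e f g h } -opt2"))
```

Note that this assumes a brace group starts its own token: in input such as `x{a b}`, the `\S+` alternative matches `x{a` first, so the group is not kept together.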
If you have to deal with nested curly brackets, the re module doesn't suffice; you need the third-party regex module, which supports recursion.
>>> import regex
>>> regex.findall(r'[^{}\s]+|{(?:[^{}]+|(?R))*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
['cmd', '-opt1', '{ a b {c d} e f}', '-opt2']
Here (?R) refers to the whole pattern itself.
Or this variant, which is generally more efficient because it unrolls the alternation inside the braces:
>>> regex.findall(r'[^{}\s]+|{[^{}]*+(?:(?R)[^{}]*)*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
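If installing the regex module is not an option, nested braces can also be handled without any regular expression by tracking the nesting depth. This is a different technique from the recursive pattern above, sketched here under the assumption that braces are balanced; the function name `tokenize_nested` is hypothetical.

```python
def tokenize_nested(command):
    """Split on whitespace outside braces; keep nested {...} groups whole."""
    tokens, current, depth = [], [], 0
    for ch in command:
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced '}'")
        if ch.isspace() and depth == 0:
            # Whitespace outside any brace group ends the current token.
            if current:
                tokens.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '{'")
    if current:
        tokens.append(''.join(current))
    return tokens

print(tokenize_nested('cmd -opt1 { a b {c d} e f} -opt2'))
```

Unlike the recursive pattern, this sketch keeps text that touches a brace group attached to it, so `x{a b}` stays a single token.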
Upvotes: 5
Reputation: 41509
Take a look at the argparse module, since I assume you are writing code to parse your program's arguments. Normally these arguments are already split for you in sys.argv, so you don't need to split the command-line string yourself. If you do have the command line as a single string, you can convert it to an argument list with the str.split method.
import argparse

parser = argparse.ArgumentParser(description='whatever cmd does.')
parser.add_argument('--opt1', metavar='N', type=int, nargs='+',
                    help='integers')
options = parser.parse_args()
# options.opt1 is None when --opt1 is not given
for n in options.opt1 or []:
    print(n)  # do something with n
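To illustrate the str.split route mentioned above, here is a small sketch that feeds an explicit argument list to the parser instead of relying on sys.argv. The argument string `'--opt1 1 2 3'` is a made-up example.

```python
import argparse

parser = argparse.ArgumentParser(description='whatever cmd does.')
parser.add_argument('--opt1', metavar='N', type=int, nargs='+', help='integers')

# Hypothetical argument string. With no argument, parse_args() reads
# sys.argv[1:]; passing a list parses that list instead.
options = parser.parse_args('--opt1 1 2 3'.split())
print(options.opt1)
```

Because of `type=int` and `nargs='+'`, `options.opt1` is a list of integers rather than a list of strings.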
Upvotes: 1
Reputation: 19040
Just split on the { and }, then split the separate parts on whitespace:
>>> s = "cmd -opt1 { a b c d e f g h } -opt2"
>>> a, b = s.split("{")
>>> c, d = b.split("}")
>>> a.split() + ["{{{0}}}".format(c)] + d.split()
['cmd', '-opt1', '{ a b c d e f g h }', '-opt2']
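The tuple unpacking above works only when the string contains exactly one `{` and one `}`. A possible variation, sketched with str.partition under the assumption that braces are never nested, also accepts input with no group or with several groups; the name `split_groups` is hypothetical.

```python
def split_groups(command):
    """Split on whitespace, keeping each non-nested {...} group whole."""
    head, brace, rest = command.partition('{')
    if not brace:
        # No brace group left: a plain whitespace split is enough.
        return command.split()
    body, close, tail = rest.partition('}')
    if not close:
        raise ValueError("missing '}'")
    # Handle any further groups in the remainder recursively.
    return head.split() + ['{' + body + '}'] + split_groups(tail)

print(split_groups("cmd -opt1 { a b c d e f g h } -opt2"))
```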
Upvotes: 0
Reputation: 67968
\s+(?![^{]*})
You can split on this pattern. See the demo:
https://regex101.com/r/jV9oV2/6
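In Python, the pattern can be passed to re.split. The negative lookahead rejects whitespace that is followed by a `}` with no `{` in between, that is, whitespace inside a brace group. This is a minimal sketch that assumes balanced, non-nested braces.

```python
import re

# Split on whitespace unless a '}' follows before any '{'.
parts = re.split(r'\s+(?![^{]*})', 'cmd -opt1 { a b c d e f g h } -opt2')
print(parts)
```

Leading or trailing whitespace produces empty strings in the result, so it may be worth calling strip() on the input first.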
Upvotes: 3