Reputation: 1588
I'm making a basic language. Well, not exactly, but you'll see. I have echo and exit commands working, but I need help with parsing the input.
If I give it the string 'echo "hello bob"', I want it to split the string up and give me an array like [echo, hello bob]. Right now echo works, but only with ONE word. So I can do 'echo bob' and it will output 'bob', but if I do 'echo hi bob' it will output 'hi', and I always want it to do that. If I have a command foo, I want to do 'foo "bar face" boo' and get [foo, bar face, boo]. So basically I want myArr.split(' '), except that anything between quotes stays together as one item. How can I do this?
Upvotes: 0
Views: 102
Reputation: 22113
Here is a simple answer:
>>> import shlex
>>> shlex.split('echo "hello bob"')
['echo', 'hello bob']
shlex is a module that helps with parsing shell-like languages.
The documentation can be found here (thank you, JIStone): http://docs.python.org/library/shlex.html
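To tie this back to the question, here is a minimal sketch of how shlex.split could feed a command dispatcher. The function name parse_command is hypothetical and not part of shlex; the only library call used is shlex.split itself.

```python
import shlex

def parse_command(line):
    """Hypothetical helper: split a line into (command, args).

    Quoted sections stay together as a single argument, which is
    the behaviour shlex.split provides.
    """
    tokens = shlex.split(line)
    if not tokens:
        return None, []
    return tokens[0], tokens[1:]

command, args = parse_command('foo "bar face" boo')
print(command, args)
```

Note that shlex.split raises ValueError when a quote is never closed, so an interactive language would probably want to catch that and report it to the user.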
Upvotes: 4
Reputation: 34698
Here is a simple tokenizer:
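import re

# Each callback receives the scanner and the matched text,
# and returns the token to emit.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

# Rules are tried in order; the float rule must come before the int rule.
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),  # None means: skip whitespace, emit nothing
])

# scan() returns a (tokens, unmatched_remainder) tuple
print(scanner.scan("sum = 3*foo + 312.50 + bar"))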
You will still need a parser to actually use this lexed content.
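For the quoted-argument case in the question, the same re.Scanner approach could be adapted along these lines. This is only a sketch: the rule set and the callback names s_string and s_word are assumptions for illustration, and it handles double quotes only, with no escape sequences.

```python
import re

def s_string(scanner, token):
    # Drop the surrounding double quotes, keep the inner text as-is
    return token[1:-1]

def s_word(scanner, token):
    return token

# Hypothetical rule set: quoted strings first, then bare words
command_scanner = re.Scanner([
    (r'"[^"]*"', s_string),
    (r'[^\s"]+', s_word),
    (r'\s+', None),  # skip whitespace between tokens
])

tokens, remainder = command_scanner.scan('foo "bar face" boo')
print(tokens)
```

If remainder comes back non-empty, the scanner stopped at input it could not match (an unclosed quote, for example), which is a reasonable place to report a syntax error.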
Upvotes: 1