Thor Correia

Reputation: 1588

Splitting strings in complicated ways?

I'm making a basic language. Well, not exactly, but you'll see. I've implemented echo and exit commands, but I need help with parsing arguments.

If I give it the string 'echo "hello bob"', I want it to split that up and give me an array like [echo, hello bob]. Right now echo works, but only with ONE word: 'echo bob' outputs 'bob', but 'echo hi bob' outputs 'hi'. And I always want it to do that. If I have a command foo, I want to do 'foo "bar face" boo' and get [foo, bar face, boo]. So basically I want to do myArr.split(' '), except that anything between quotes stays together as one item. How can I do this?

Upvotes: 0

Views: 102

Answers (2)

Matt

Reputation: 22113

Here is a simple answer:

>>> import shlex
>>> shlex.split('echo "hello bob"')
['echo', 'hello bob']

shlex is a module that helps with parsing shell-like languages.

The documentation can be found here (thank you, JIStone): http://docs.python.org/library/shlex.html
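To connect this to the 'foo "bar face" boo' case from the question, here is a minimal sketch of how the split result might be turned into a command name plus arguments. The helper name parse_command is hypothetical and not part of shlex; the only library call used is shlex.split, which raises ValueError when a quote is left unclosed.

```python
import shlex


def parse_command(line):
    """Split a command line shell-style into (name, args).

    parse_command is a hypothetical helper for illustration;
    only shlex.split comes from the standard library.
    """
    try:
        tokens = shlex.split(line)
    except ValueError:
        # shlex.split raises ValueError on an unclosed quote.
        return None, []
    if not tokens:
        # Empty or whitespace-only input: no command to run.
        return None, []
    return tokens[0], tokens[1:]


name, args = parse_command('foo "bar face" boo')
print(name, args)
```

Unquoted words are still split on whitespace, so 'echo hi bob' gives the arguments 'hi' and 'bob' separately, and an echo command can join them back with spaces if that is the behaviour you want.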

Upvotes: 4

Jakob Bowyer

Reputation: 34698

Here is a simple tokenizer:

import re

def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])

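# scan() returns a tuple: (list of tokens, unmatched remainder of the input)
print(scanner.scan("sum = 3*foo + 312.50 + bar"))
# (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')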

You will need a parser to actually use this lexed content.
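For the quoted-argument case in the question, the same re.Scanner approach could be adapted with a rule for double-quoted strings. This is only a sketch under some assumptions: the names command_scanner, s_string, s_word and tokenize_command are made up for illustration, escaped quotes inside strings are not handled, and re.Scanner itself is an undocumented part of the re module.

```python
import re


def s_string(scanner, token):
    # Drop the surrounding double quotes, keep the inner text as one token.
    return token[1:-1]


def s_word(scanner, token):
    # A bare word: any run of characters that is not whitespace or a quote.
    return token


# Hypothetical scanner for command lines; rules are tried in order.
command_scanner = re.Scanner([
    (r'"[^"]*"', s_string),   # "quoted text" (no escape handling)
    (r'[^\s"]+', s_word),     # unquoted word
    (r"\s+", None),           # skip whitespace between tokens
])


def tokenize_command(line):
    """Return the list of tokens, or raise ValueError on leftover input."""
    tokens, remainder = command_scanner.scan(line)
    if remainder:
        # Typically an unclosed quote that no rule could match.
        raise ValueError("could not tokenize: %r" % remainder)
    return tokens


print(tokenize_command('foo "bar face" boo'))
```

Because the input here is already a flat list of arguments, no separate parser is needed for this case: the first token can be treated as the command and the rest as its arguments.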

Upvotes: 1
