Thomas

Reputation: 249

Python: conditionally splitting a string

There are lots of questions about split in Python, but I couldn't find one that matches my problem. I want to split a string, but the separator needs to change depending on a condition. For the test case, my string is "11xx22xx33xxBEGINxx44xx55xxENDxx66xx77". I want to process this string in chunks, stepping through it like this:

split off '11', do something with it

split off '22', do something with it

split off '33', do something with it

split off 'BEGINxx44xx55xxEND', do something with it

split off '66', do something with it

split off '77', do something with it

I tried a recursive function:

import string

mystring = "11xx22xx33xxBEGINxx44xx55xxENDxx66xx77"

def makechunks(s):
    try: splitter
    except NameError:
        splitter = "xx"
    whole = s.split(splitter, 1)
    current = whole[0]
    try: whole[1]
    except NameError:
        return
    else:
        rest = whole[1]
        if current.find("BEGIN", 0, 5):
            splitter = "END"
        else:
            splitter = "xx"
        makechunks(rest)
        print("AA", current, "BB")

makechunks(mystring)

But I'm getting the error "list index out of range." Maybe my entire approach is flawed, and there are better ways to achieve what I want? I'll be grateful for any hint.

Thanks!

Upvotes: 2

Views: 172

Answers (2)

aleph_null

Reputation: 5786

What about splitting them all and then joining all the ones between BEGIN and END?

ssplit = mystring.split("xx")
bIndex = ssplit.index("BEGIN")
eIndex = ssplit.index("END")
bend = "xx".join(ssplit[bIndex:eIndex+1])
others = ssplit[:bIndex] + ssplit[eIndex+1:]

Now you have your BEGIN..END token in 'bend' and the remaining tokens in 'others'.
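If the chunks need to be processed in their original order, the same split-and-join idea can keep everything in one list. This is only a minimal sketch: it assumes exactly one BEGIN...END pair, with each marker standing alone between "xx" separators, and the function name merge_block is made up for illustration.

```python
def merge_block(s, sep="xx", begin="BEGIN", end="END"):
    """Split s on sep, then rejoin the tokens from begin through end."""
    tokens = s.split(sep)
    b = tokens.index(begin)       # raises ValueError if begin is missing
    e = tokens.index(end, b)      # look for end only at or after begin
    merged = sep.join(tokens[b:e + 1])
    # Keep the original order: tokens before, the merged block, tokens after.
    return tokens[:b] + [merged] + tokens[e + 1:]


mystring = "11xx22xx33xxBEGINxx44xx55xxENDxx66xx77"
for chunk in merge_block(mystring):
    print("AA", chunk, "BB")
```

With several BEGIN...END pairs this would need a loop over the tokens instead of a single index lookup.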

Upvotes: 1

Mark Byers

Reputation: 839124

You can do it with a regular expression:

import re
matches = re.findall(r'(?:^|xx)(BEGIN.*?END|.*?)(?=xx|$)', s)

Result:

['11', '22', '33', 'BEGINxx44xx55xxEND', '66', '77']

See it working online: ideone
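For stepping through the chunks one at a time, as the question asks, the same pattern can be used with re.finditer. A self-contained sketch, assuming the input is the test string from the question (the variable names pattern and chunks are only illustrative):

```python
import re

s = "11xx22xx33xxBEGINxx44xx55xxENDxx66xx77"

# Try the BEGIN...END alternative first; otherwise take the shortest run
# that stops just before the next "xx" or the end of the string.
pattern = re.compile(r'(?:^|xx)(BEGIN.*?END|.*?)(?=xx|$)')

chunks = [m.group(1) for m in pattern.finditer(s)]
for chunk in chunks:
    print("AA", chunk, "BB")
```

Because the BEGIN.*?END alternative is listed first, the "xx" separators inside the block are not treated as split points. Note that "." does not match newlines by default, so a multi-line input would probably need the re.DOTALL flag.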

Upvotes: 5
