equanimity

Reputation: 2533

Splitting a string and retaining the delimiter with the delimiter appearing contiguously

I have the following string:

bar = 'F9B2Z1F8B30Z4'

I have a function foo that splits the string on F, then adds back the F delimiter.

def foo(my_str):
    res = ['F' + elem for elem in my_str.split('F') if elem != '']
    return res

This works unless there are two "F"s back-to-back in the string. For example,

foo('FF9B2Z1F8B30Z4')

returns

['F9B2Z1', 'F8B30Z4']

(the double "F" at the start of the string is not processed)

I'd like the function to split on the first "F" and add it to the list, as follows:

['F', 'F9B2Z1', 'F8B30Z4']

If there is a double "F" in the middle of the string, then the desired behavior would be:

foo('F9B2Z1FF8B30Z4')

['F9B2Z1', 'F', 'F8B30Z4']

Any help would be greatly appreciated.

Upvotes: 2

Views: 66

Answers (2)

user7864386

Drop the filtering if and slice the result instead: the only unwanted empty string is the one that split produces before a leading "F".

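def foo(my_str):
    # Prefix every piece with 'F'; empty pieces become a bare 'F'.
    res = ['F' + elem for elem in my_str.split('F')]
    # A leading 'F' yields one extra empty piece at index 0, so drop it.
    return res[1:] if my_str.startswith('F') else res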

Output:

>>> foo('FF9B2Z1F8B30Z4')
['F', 'F9B2Z1', 'F8B30Z4']

>>> foo('FF9B2Z1FF8B30Z4FF')
['F', 'F9B2Z1', 'F', 'F8B30Z4', 'F', 'F']

>>> foo('9B2Z1F8B30Z4')
['F9B2Z1', 'F8B30Z4']

>>> foo('')
['F']
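
To see why only the first element needs to go, it may help to look at what split returns before the prefix is added. The snippet below is a small sketch that repeats foo so it runs on its own; the variable name parts is only for illustration.

```python
def foo(my_str):
    # Same logic as the function above, repeated so this snippet is self-contained.
    res = ['F' + elem for elem in my_str.split('F')]
    return res[1:] if my_str and my_str[0] == 'F' else res

# A leading 'F' produces one empty string at index 0; every extra adjacent 'F'
# produces one more empty string, and those are the ones worth keeping.
parts = 'FF9B2Z1F8B30Z4'.split('F')
print(parts)  # ['', '', '9B2Z1', '8B30Z4']

# The double "F" in the middle of the string from the question:
print(foo('F9B2Z1FF8B30Z4'))  # ['F9B2Z1', 'F', 'F8B30Z4']
```

The original filter removed every empty string, including the ones that stand for a lone "F"; slicing removes only the first.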

Upvotes: 3

MYousefi

Reputation: 1008

With a regular expression, this can be done using re.findall and the following pattern:

import re

pattern = r'^[^F]+|(?<=F)[^F]*'

^[^F]+ matches the leading run of non-F characters, which exists only when the string does not start with F.

(?<=F)[^F]* matches the run of non-F characters that follows each F. The run may be empty, which is what produces a bare "F" for back-to-back delimiters.

>>> print(['F' + x for x in re.findall(pattern, 'abcFFFAFF')])
['Fabc', 'F', 'F', 'FA', 'F', 'F']

>>> print(['F' + x for x in re.findall(pattern, 'FFabcFA')])
['F', 'Fabc', 'FA']

>>> print(['F' + x for x in re.findall(pattern, 'abc')])
['Fabc']

Note that this returns an empty list for an empty string. If an empty string should return ['F'] instead, change the pattern to r'^[^F]+|(?<=F)[^F]*|^$'; the added ^$ branch matches the empty string.
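
As a rough sketch, the extended pattern could be wrapped in a function like the one below. The names PATTERN and split_keep_f are made up for this example and are not part of the re module.

```python
import re

# The pattern from this answer, with the optional ^$ branch for empty input.
PATTERN = re.compile(r'^[^F]+|(?<=F)[^F]*|^$')

def split_keep_f(my_str):
    # Hypothetical helper: findall returns each matched run (possibly empty),
    # and every run gets the 'F' delimiter put back in front of it.
    return ['F' + x for x in PATTERN.findall(my_str)]

print(split_keep_f('F9B2Z1FF8B30Z4'))  # ['F9B2Z1', 'F', 'F8B30Z4']
print(split_keep_f(''))                # ['F']
```

Without the ^$ branch, the second call would return an empty list.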

Upvotes: 0
