Reputation: 13
Is there a way to split a string on either of two delimiters?
Code example:
# word --> u1 or word --> u2
a = "Hi thereu1hello ?u1Whatu2Goodu1Work worku2Stacku2"
# here we must split this string on the two markers "u1" and "u2" and collect the pieces in two lists, like this
u1 = ["Hi there", "hello ?", "Good"]
u2 = ["What", "Work work", "Stack"]
Upvotes: 1
Views: 2040
Reputation: 51643
You can iterate over the string character by character and accumulate characters in a part list until the last character in that list is 'u' and the current character is '1' or '2'.
You then join the part list back together, omitting its last character (the 'u'), append the result to either u1 or u2, and clear part:
a = "Hi thereu1hello ?u1Whatu2Goodu1Work worku2Stacku2"
u1 = []
u2 = []
part = []
# iterate your string character-wise
for c in a:
    # last character collected == u and now 1 or 2?
    if part and part[-1] == "u" and c in ["1","2"]:
        if c == "1":
            u1.append(''.join(part[:-1])) # join all collected chars, omit 'u'
            part=[]
        else:
            u2.append(''.join(part[:-1])) # see above, same.
            part=[]
    else:
        part.append(c)
# there is no end condition: if your string ends on neither u1 nor u2,
# the last part of your string is not added to either u1 or u2
print(u1)
print(u2)
Output:
['Hi there', 'hello ?', 'Good']
['What', 'Work work', 'Stack']
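If the string might not end on u1 or u2, the leftover characters can be returned instead of silently dropped. A minimal sketch, assuming you want the remainder back as a third value; the function name split_on_markers is hypothetical and not part of the code above:

```python
def split_on_markers(text):
    """Character-wise split on 'u1'/'u2'; returns (u1, u2, leftover)."""
    u1, u2, part = [], [], []
    for c in text:
        # last character collected == u and now 1 or 2?
        if part and part[-1] == "u" and c in ("1", "2"):
            target = u1 if c == "1" else u2
            target.append("".join(part[:-1]))  # join collected chars, omit 'u'
            part = []
        else:
            part.append(c)
    # whatever is still in part was never followed by a marker
    return u1, u2, "".join(part)


u1, u2, leftover = split_on_markers("Hi thereu1Whatu2trailing text")
print(u1, u2, leftover)
```

What to do with the leftover (ignore it, raise an error, or put it in one of the lists) depends on your data.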
A second way would be to remember certain indexes (where the last slice ended, where we are now) and just slice the correct part from the input:
u1 = []
u2 = []
oldIdx = 0 # where to start slicing, update on append to either u1 or u2
lastOne = "" # character in last iteration
for i,c in enumerate(a): # get the index (i) and the character (c) from enumerate
    if lastOne == "u" and c in ["1","2"]:
        if c == "1":
            u1.append(a[oldIdx:i-1]) # slice the correct part from a
        else:
            u2.append(a[oldIdx:i-1]) # slice the correct part from a
        oldIdx = i+1 # update slice starting position
        lastOne = "" # reset last one
    else:
        lastOne = c # remember char as lastOne
Storing a single integer and a character takes less memory and time than storing and appending to a part list. You also do not need to join the parts before appending, because you slice directly from the source, so it is slightly more efficient.
Upvotes: 2
Reputation: 43136
You can use regex to implement a trivially extensible solution:
import re
a = "Hi thereu1hello ?u1Whatu2Goodu1Work worku2Stacku2"
separators = ['u1', 'u2']
regex = r'(.*?)({})'.format('|'.join(re.escape(sep) for sep in separators))
result = {sep: [] for sep in separators}
for match in re.finditer(regex, a, flags=re.S):
    text = match.group(1)
    sep = match.group(2)
    result[sep].append(text)
print(result)
# {'u1': ['Hi there', 'hello ?', 'Good'],
# 'u2': ['What', 'Work work', 'Stack']}
This constructs a regex out of the separators u1 and u2 like so:
(.*?)(u1|u2)
And then it iterates over all matches of this regex and appends them to the corresponding list.
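A variation on the same idea, offered as a sketch rather than a replacement: re.split keeps the separators in its result when the pattern is wrapped in a capturing group, so the pieces alternate between text and separator. The names pieces and leftover are introduced here for illustration only.

```python
import re

a = "Hi thereu1hello ?u1Whatu2Goodu1Work worku2Stacku2"
separators = ['u1', 'u2']
# the capturing group makes re.split return the separators as well
pattern = '({})'.format('|'.join(re.escape(sep) for sep in separators))

pieces = re.split(pattern, a)
# pieces alternates: text, separator, text, separator, ..., trailing text
result = {sep: [] for sep in separators}
for text, sep in zip(pieces[0::2], pieces[1::2]):
    result[sep].append(text)

leftover = pieces[-1]  # text after the last separator, '' if there is none
print(result)
```

Unlike the finditer version, this also exposes any text that follows the last separator.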
Upvotes: 1