Reputation: 1291
Suppose I have the string:
string = "this is a test string <LW> I want to <NL>split this string<NL> by each tag I have inserted.<AB>"
I want to split the string by each custom tag I have inserted in the string in a previous function:
tags = ["<LW>", "<NL>", "<AB>"]
This is the desired output:
splitString = splitByTags(string, tags)
for s in splitString:
    print(s)
Output
"this is a test string <LW>"
" I want to <NL>"
"split this string<NL>"
" by each tag I have inserted.<AB>"
So basically, I want to split the string by multiple substrings while keeping each substring at the end of the piece it terminates. What is the quickest and most efficient way of doing this? I am aware that I can use string.split and simply append the delimiter back onto each piece, but I am unsure how to do this with multiple delimiters.
Upvotes: 1
Views: 575
Reputation:
Here is an example of how to do this:
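import re

def split_string(string, tags):
    # Match any tag in a single left-to-right pass, so the pieces come out
    # in string order regardless of the order in which the tags are listed.
    pattern = "|".join(re.escape(tag) for tag in tags)
    string_list = []
    start = 0
    for match in re.finditer(pattern, string):
        string_list.append(string[start:match.end()])
        start = match.end()
    # Keep any text that follows the last tag.
    if start < len(string):
        string_list.append(string[start:])
    return string_list

data = split_string(string, tags)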
Output:
['this is a test string <LW>', ' I want to <NL>', 'split this string<NL>', ' by each tag I have inserted.<AB>']
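A variation on the same idea, offered only as a sketch: split on the empty position right after each tag by using one lookbehind per tag. The function name split_after_tags is made up for illustration, and the approach assumes Python 3.7 or later, where re.split can split on empty matches.

```python
import re

def split_after_tags(text, tags):
    # Hypothetical helper, not part of the original answer.
    # One lookbehind per tag, so tags of different lengths are allowed.
    pattern = "|".join("(?<=" + re.escape(tag) + ")" for tag in tags)
    # Splitting on empty matches requires Python 3.7 or later.
    # Drop empty pieces, such as the one produced when the text ends with a tag.
    return [piece for piece in re.split(pattern, text) if piece]

string = "this is a test string <LW> I want to <NL>split this string<NL> by each tag I have inserted.<AB>"
tags = ["<LW>", "<NL>", "<AB>"]
data = split_after_tags(string, tags)
```

Because every piece is a plain slice of the input, joining the result with "".join(data) gives back the original string.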
Upvotes: 0
Reputation: 82765
Use re.split with capturing parentheses, which keeps the matched tags in the result list.
Example:
import re
string = "this is a test string <LW> I want to <NL>split this string<NL> by each tag I have inserted.<AB>"
tags = ["<LW>", "<NL>", "<AB>"]
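# Escape each tag and wrap the alternation in a capturing group so re.split keeps the tags.
splt_str = re.split("(" + "|".join(map(re.escape, tags)) + ")", string)
# splt_str alternates text, tag, text, tag, ...; print each text with the tag that follows it.
for i in range(0, len(splt_str), 2):
    print("".join(splt_str[i:i+2]))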
Output:
this is a test string <LW>
 I want to <NL>
split this string<NL>
 by each tag I have inserted.<AB>
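If a list is needed rather than printed lines, the same re.split idea can be wrapped in a function. This is a minimal sketch, not a definitive implementation: the name split_by_tags is hypothetical (it mirrors the splitByTags call in the question), and it assumes tags is a non-empty list of literal strings.

```python
import re

def split_by_tags(text, tags):
    """Split text after each tag, keeping the tag at the end of its piece."""
    # Escape the tags so characters such as "." or "(" match literally.
    pattern = "(" + "|".join(re.escape(tag) for tag in tags) + ")"
    # With a capturing group, re.split returns [text, tag, text, tag, ..., text].
    parts = re.split(pattern, text)
    # Glue each text chunk to the tag that follows it.
    pieces = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
    # The last element is whatever follows the final tag; keep it if non-empty.
    if parts[-1]:
        pieces.append(parts[-1])
    return pieces

string = "this is a test string <LW> I want to <NL>split this string<NL> by each tag I have inserted.<AB>"
tags = ["<LW>", "<NL>", "<AB>"]
for s in split_by_tags(string, tags):
    print(repr(s))
```

Dropping the empty trailing element avoids the blank line that the loop above prints when the string ends with a tag, while text after the last tag is still returned as its own piece.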
Upvotes: 3