Reputation: 53
I have a text that contains many lines.
I want to split it on any line that ends with a specific character.
For example, my text contains the data below:
Hi
I'm here:
London
UK
USA
Where are you:
here
there
what will you do:
something
somethin2
I want to split this text into a list, using any line that ends with a colon (:) as the delimiter.
In this case the final result list would be
[ Hi, London UK USA, here there, something somethin2 ]
How do I do that in Python?
I'm aware that we can split on a single character or on some fixed string that acts as a common delimiter. But what should I do in this case, where the delimiter lines differ and only share their last character?
Upvotes: 2
Views: 968
Reputation: 103844
You can use a regex split:
>>> import re
>>> [s.strip().replace('\n',' ') for s in re.split(r'^.*:$',txt, flags=re.M)]
['Hi', 'London UK USA', 'here there', 'something somethin2']
The regex ^.*:$ matches a full line ending in a colon (with re.M, ^ and $ anchor at the start and end of each line).
re.split splits the string on that pattern and drops the delimiting line. Then strip each block and replace \n with ' ' in it, and you have the desired output.
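For reference, the same idea can be written as a self-contained script. This is only a sketch: the function name split_on_colon_lines is hypothetical, txt is assumed to hold the sample text from the question, and whitespace is collapsed with split/join rather than strip/replace.

```python
import re

# Assumption: txt holds the sample text from the question.
txt = """Hi
I'm here:
London
UK
USA
Where are you:
here
there
what will you do:
something
somethin2"""


def split_on_colon_lines(text):
    """Split text on whole lines ending in ':' and flatten each block.

    Hypothetical helper name, used here for illustration only.
    """
    # re.M lets ^ and $ match at every line boundary, so ^.*:$ matches
    # exactly one complete line that ends with a colon.
    blocks = re.split(r'^.*:$', text, flags=re.M)
    # Each block still contains newlines; split() then join() collapses
    # all runs of whitespace into single spaces.
    return [' '.join(block.split()) for block in blocks]


result = split_on_colon_lines(txt)
print(result)
```

If the text begins or ends with a delimiter line, the result contains an empty string at that position, which may need filtering out.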
Upvotes: 0
Reputation: 3447
Here is a small example of how this can be done.
Note: this is easier to understand but far less efficient than @Ajax1234's answer.
text = '''Hi
I'm here:
London
UK
USA
Where are you:
here
there
what will you do:
something
somethin2'''
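# replace each line that ends with ':' by a ',' marker; keep the other lines (stripped)
# note: this assumes the text itself contains no commas
output = [',' if line.strip().endswith(':') else line.strip() for line in text.split('\n')]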
# join the list on space
output = ' '.join(output)
# split back into list on ',' and trim the white spaces
output = [item.strip() for item in output.split(',')]
print(output)
Outputs:
['Hi', 'London UK USA', 'here there', 'something somethin2']
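Because the comma is used as a temporary marker, the approach above would mis-split text that already contains commas. A possible variant, not the answerer's code, collects the lines into groups directly and needs no marker. The function name group_by_colon_lines is hypothetical.

```python
def group_by_colon_lines(text):
    """Group lines between colon-terminated lines; no marker character needed.

    Hypothetical helper, shown as a sketch only.
    """
    groups = [[]]  # start with one open group for lines before the first delimiter
    for line in text.split('\n'):
        line = line.strip()
        if line.endswith(':'):
            # A delimiter line closes the current group and opens a new one.
            groups.append([])
        elif line:
            # Ordinary non-empty lines go into the group that is currently open.
            groups[-1].append(line)
    return [' '.join(group) for group in groups]


sample = "Hi\nI'm here:\nLondon\nUK\nUSA\nWhere are you:\nhere\nthere"
print(group_by_colon_lines(sample))
print(group_by_colon_lines("a, b\nx:\nc"))
```

Commas in the input survive untouched here, since nothing is joined and re-split on a marker.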
Upvotes: 0
Reputation: 71451
You can use itertools.groupby:
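import itertools
# content holds the text from the question; group consecutive lines by whether they end with ':'
data = [[a, list(b)] for a, b in itertools.groupby(content.split('\n'), key=lambda x: x.endswith(':'))]
# keep only the groups of non-delimiter lines and join each with spaces
final_result = [' '.join(b) for a, b in data if not a]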
Output:
['Hi', 'London UK USA', 'here there', 'something somethin2']
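A runnable version of the groupby approach might look like the sketch below. It assumes content holds the sample text from the question, and the function name split_with_groupby is hypothetical.

```python
import itertools

# Assumption: content holds the sample text from the question.
content = """Hi
I'm here:
London
UK
USA
Where are you:
here
there
what will you do:
something
somethin2"""


def split_with_groupby(text):
    """Join runs of consecutive non-delimiter lines; hypothetical helper name."""
    lines = text.split('\n')
    # groupby yields (key, group) pairs for each run of consecutive lines
    # sharing the same key: True for lines ending in ':', False otherwise.
    grouped = itertools.groupby(lines, key=lambda line: line.endswith(':'))
    # Keep only the runs of ordinary lines and join each run with spaces.
    return [' '.join(group) for is_delimiter, group in grouped if not is_delimiter]


final_result = split_with_groupby(content)
print(final_result)
```

Since groupby merges consecutive lines with the same key, two delimiter lines in a row produce no empty entry between them, which differs from the regex split.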
Upvotes: 4