Sukumar

Reputation: 53

How to split a list of strings based on a delimiter string that ends with a specific character in Python?

I have a text that contains many lines.

I want to split it on any line that ends with a specific character.

For example, my text contains the data below:

Hi
I'm here:
London
UK
USA
Where are you:
here 
there
what will you do:
something
somethin2

I want to split this text into a list, using any line that ends with a colon (:) as the delimiter.

In this case the final result list would be ['Hi', 'London UK USA', 'here there', 'something somethin2']. How do I do that in Python?

I'm aware that we can split on a single character or on some other fixed string that is a common delimiter. But what should I do in this case?

Upvotes: 2

Views: 968

Answers (3)

dawg

Reputation: 103844

You can use a regex split (txt is assumed to hold the text from the question):

>>> import re
>>> [s.strip().replace('\n',' ') for s in re.split(r'^.*:$',txt, flags=re.M)] 
['Hi', 'London UK USA', 'here there', 'something somethin2']

With the re.M (multiline) flag, the regex ^.*:$ matches full lines ending in a :

re.split then splits the string on that pattern and drops the delimiting lines. Stripping each block and replacing \n with ' ' gives the desired output.
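As a self-contained sketch of the same regex idea: txt is assumed to hold the sample text from the question, and split_on_colon_lines is a hypothetical helper name. One sample line ('here ') appears to carry a trailing space, so this variant collapses whitespace with ' '.join(block.split()) rather than replace, and tolerates spaces after the colon. Both tweaks are additions of mine, not part of the answer above.

```python
import re

# Sample text from the question; note the trailing space after "here".
txt = "\n".join([
    "Hi",
    "I'm here:",
    "London",
    "UK",
    "USA",
    "Where are you:",
    "here ",
    "there",
    "what will you do:",
    "something",
    "somethin2",
])


def split_on_colon_lines(text):
    # Split on whole lines that end with ':' (optionally followed by spaces/tabs).
    blocks = re.split(r'^.*:[ \t]*$', text, flags=re.M)
    # Collapse newlines and repeated spaces inside each block.
    return [' '.join(block.split()) for block in blocks]


print(split_on_colon_lines(txt))
```

If the text starts or ends with a delimiter line, this sketch returns an empty string for that edge block; filter those out if they are unwanted.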

Upvotes: 0

Vineeth Sai

Reputation: 3447

Here is a small example of how this can be done.

Note: This is easier to understand but far less efficient than @Ajax1234's answer. It also assumes the text itself contains no commas, since ',' is used as the split marker.

text = '''Hi
I'm here:
London
UK
USA
Where are you:
here 
there
what will you do:
something
somethin2'''

# replace each line that ends with ':' by a ',' marker; keep other lines, stripped
output = [',' if line.rstrip().endswith(':') else line.strip() for line in text.split('\n')]

# join the list on space
output = ' '.join(output) 

# split back into list on ',' and trim the white spaces
output = [item.strip() for item in output.split(',')]

print(output)

Outputs:

['Hi', 'London UK USA', 'here there', 'something somethin2']
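If the text may itself contain commas, the marker trick above would split in the wrong places. A minimal sketch of a marker-free variant using an explicit loop follows; split_blocks and the sample string are hypothetical, made up only to illustrate the comma case.

```python
def split_blocks(text, end_char=':'):
    # Start with one open block; each delimiter line opens a new one.
    blocks = [[]]
    for line in text.split('\n'):
        line = line.strip()
        if line.endswith(end_char):
            blocks.append([])          # delimiter line: start a new block
        elif line:
            blocks[-1].append(line)    # ordinary non-empty line: add to current block
    return [' '.join(block) for block in blocks]


# Invented sample containing commas, to show they survive intact.
sample = "Hi, all\nI'm here:\nLondon, UK\nUSA\nWhere are you:\nhere \nthere"
print(split_blocks(sample))
```

Because no marker character is inserted into the text, the result does not depend on which characters the data happens to contain.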

Upvotes: 0

Ajax1234

Reputation: 71451

You can use itertools.groupby (content is assumed to hold the text from the question):

import itertools
data = [[a, list(b)] for a, b in itertools.groupby(content.split('\n'), key=lambda x:x.endswith(':'))]
final_result = [' '.join(b) for a, b in data if not a]

Output:

['Hi', 'London UK USA', 'here there', 'something somethin2']
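To see why this works, it may help to look at the intermediate groups. The sketch below assumes content holds the sample text from the question and strips each line first, since one sample line appears to have a trailing space; the stripping step and the variable names are my additions.

```python
import itertools

# Sample text from the question; note the trailing space after "here".
content = "\n".join([
    "Hi",
    "I'm here:",
    "London",
    "UK",
    "USA",
    "Where are you:",
    "here ",
    "there",
    "what will you do:",
    "something",
    "somethin2",
])

# Strip each line so trailing spaces do not affect endswith or the joined output.
lines = [line.strip() for line in content.split('\n')]

# groupby bundles consecutive lines that share the same key value:
# True for delimiter lines (ending in ':'), False for everything else.
groups = [
    (is_delim, list(group))
    for is_delim, group in itertools.groupby(lines, key=lambda s: s.endswith(':'))
]

for is_delim, group in groups:
    print(is_delim, group)

# Keep only the non-delimiter runs and join each one with spaces.
final_result = [' '.join(group) for is_delim, group in groups if not is_delim]
print(final_result)
```

Since groupby merges consecutive items with equal keys, two delimiter lines in a row form a single True group, so no empty block appears between them.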

Upvotes: 4
