Reputation: 9019
I want to be able to split a list of items when reaching a capitalized word, for example:
Input:
s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']
Output:
['HARRIS', 'second', 'caught']
['JONES', 'third']
['Smith', 'stole', 'third']
Would it be best to approach this problem using s.index('some regex') and then split the list accordingly at those given indices?
Upvotes: 1
Views: 119
Reputation: 75
str.istitle("Abc") #True
str.istitle("ABC") #False
str.istitle("ABc") #False
str.isupper("Abc") #False
str.isupper("ABC") #True
str.isupper("ABc") #False
So I think checking whether the first letter of the string is uppercase will help you:
a = "Abc"
print(str.isupper(a[0]))
or
a = "Abc"
print(a[0].isupper())
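To see how these checks differ on words from the question, here is a small sketch. The helper name `starts_with_capital` is made up for illustration, and the `[:1]` slice is an assumption added so that empty strings do not raise an error.

```python
def starts_with_capital(word):
    # Hypothetical helper: word[:1] is '' for an empty string,
    # and ''.isupper() is False, so no IndexError is raised.
    return word[:1].isupper()

words = ['HARRIS', 'second', 'Smith']

# For each word: (word, all caps?, title case?, first letter capital?)
checks = [(w, w.isupper(), w.istitle(), starts_with_capital(w)) for w in words]
for row in checks:
    print(row)
```

Only the first-letter check is true for both 'HARRIS' and 'Smith', which is what the expected output in the question needs.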
Upvotes: 0
Reputation: 22963
If you're willing to use a third-party library, you can use iteration_utilities.Iterable
to accomplish this easily:
>>> from iteration_utilities import Iterable
>>>
>>> lst = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']
>>> Iterable(lst).split(str.isupper, keep_after=True).filter(lambda l: l).as_list()
[['HARRIS', 'second', 'caught'], ['JONES', 'third', 'Smith', 'stole', 'third']]
Upvotes: 1
Reputation: 3818
A straightforward way is to iterate over the list: when we find a capitalized word, we start a new list; otherwise we append to the current one.
s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third', 'H']
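def split_by(lst, p):
    # Start a new sublist at each item for which p(item) is true.
    lsts = []
    for x in lst:
        if p(x) or not lsts:
            # "not lsts" keeps leading items that precede the first match
            lsts.append([x])
        else:
            lsts[-1].append(x)
    return lsts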
print(split_by(s, str.isupper))
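The choice of predicate matters here. As a rough comparison, the sketch below runs the same loop with `str.isupper` and with a first-letter check. The first-letter lambda and the `or not lsts` guard are assumptions added for illustration: the guard avoids an IndexError when the list starts with a lowercase word.

```python
def split_by(lst, p):
    lsts = []
    for x in lst:
        # Start a new sublist on a match, or when nothing has been collected yet.
        if p(x) or not lsts:
            lsts.append([x])
        else:
            lsts[-1].append(x)
    return lsts

s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']

# str.isupper is true only for all-caps words, so 'Smith' does not split.
all_caps = split_by(s, str.isupper)

# Checking only the first letter also splits at 'Smith'.
first_cap = split_by(s, lambda w: w[:1].isupper())

print(all_caps)
print(first_cap)
```

With the first-letter predicate the result has the three sublists asked for in the question; with `str.isupper` the 'Smith' group stays attached to the 'JONES' group.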
Upvotes: 0
Reputation: 71461
You can try this:
s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']
indices = [i for i, a in enumerate(s) if a[0].isupper()]
indices.append(len(s))
final_list = [s[indices[i]:indices[i+1]] for i in range(len(indices)-1)]
Output:
[['HARRIS', 'second', 'caught'], ['JONES', 'third'], ['Smith', 'stole', 'third']]
Note that this solution splits only at elements whose first letter is uppercase, and any elements before the first such element are dropped.
If you want to split at any element that contains an uppercase letter anywhere:
s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']
indices = [i for i, a in enumerate(s) if any(b.isupper() for b in a)]
indices.append(len(s))
final_list = [s[indices[i]:indices[i+1]] for i in range(len(indices)-1)]
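Both variants share the same slicing step, so they could be folded into one function that takes the test as a parameter. This is only a sketch; the names `split_at` and `is_start` are made up here, and `zip` is used to pair consecutive indices instead of indexing by position.

```python
def split_at(lst, is_start):
    # Positions where a new sublist begins.
    indices = [i for i, item in enumerate(lst) if is_start(item)]
    # Add the end of the list as the final boundary.
    bounds = indices + [len(lst)]
    # Pair each boundary with the next one and slice between them.
    return [lst[a:b] for a, b in zip(bounds, bounds[1:])]

s = ['HARRIS', 'second', 'caught', 'JONES', 'third', 'Smith', 'stole', 'third']
result = split_at(s, lambda w: any(c.isupper() for c in w))
print(result)
```

As with the code above, items that come before the first matching element are not included in the result.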
Upvotes: 2