Reputation: 348
From a string like "A B c de F G A", I would like to get the following list: ["A B", "F G A"]. That is, I need to get all the sequences of consecutive uppercase words.
I tried something like this:
text = "A B c de F G A"
result = []
for i, word in enumerate(text.split()):
    if word[0].isupper():
        s = ""
        while word[0].isupper():
            s += word
            i += 1
            word = text[i]
        result.append(s)
But it produces the following output: ['A', 'BB', 'F', 'G', 'A']
I suppose this happens because you can't skip a list element just by incrementing i. How can I avoid this and get the right output?
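For reference, two things go wrong in the loop above: incrementing i has no effect on which element enumerate yields next, and text[i] indexes characters of the string rather than words of the split list (which is where 'BB' comes from). A minimal sketch of a plain-loop fix, assuming a word counts as uppercase when its first character is uppercase; the name upper_runs_loop is made up for illustration:

```python
def upper_runs_loop(text):
    """Collect runs of consecutive words whose first letter is uppercase."""
    result = []
    current = []  # words of the run currently being built
    for word in text.split():
        if word[0].isupper():
            current.append(word)
        elif current:
            # A lowercase word ends the run that was in progress.
            result.append(" ".join(current))
            current = []
    if current:
        # Flush a run that extends to the end of the string.
        result.append(" ".join(current))
    return result


print(upper_runs_loop("A B c de F G A"))
```

Accumulating words in a list and flushing it when the run ends avoids any need to skip ahead in the iteration.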
Upvotes: 0
Views: 4081
Reputation: 165
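The following example extracts all sequences of consecutive uppercase words from a string:
import re
string = "A B c de F G A"
# Split on runs of lowercase letters, then drop the whitespace-only pieces.
[val.strip() for val in re.split(r'[a-z]+', string) if val.strip()]
# ['A B', 'F G A']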
Upvotes: 0
Reputation: 22314
You can use re.split to split a string with a regex.
import re
def get_upper_sequences(s):
    return re.split(r'\s+[a-z][a-z\s]*', s)
>>> get_upper_sequences("A B c de F G A")
['A B', 'F G A']
Upvotes: 1
Reputation: 43504
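Here is a solution without itertools or re:
def findTitles(text):
    # Replace every non-title word with a space, so runs end up separated by two or more spaces.
    filtered = " ".join([x if x.istitle() else " " for x in text.split()])
    # Split on a double space (written as " " * 2 so the width is unambiguous).
    return [y.strip() for y in filtered.split(" " * 2) if y.strip()]
print(findTitles(text="A B c de F G A"))
#['A B', 'F G A']
print(findTitles(text="A Bbb c de F G A"))
#['A Bbb', 'F G A']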
Upvotes: 0
Reputation: 71451
You can use itertools.groupby:
import itertools
s = "A B c de F G A"
new_s = [' '.join(b) for a, b in itertools.groupby(s.split(), key=str.isupper) if a]
Output:
['A B', 'F G A']
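Note that str.isupper is only true for words that are entirely uppercase, so a word such as "Bbb" would break a run. If the intent is to test only the first letter, as in the question's word[0].isupper() check, a variant along these lines may help. This is a sketch under that assumption, and the name upper_runs_groupby is hypothetical:

```python
import itertools


def upper_runs_groupby(text):
    """Group consecutive words by whether their first letter is uppercase."""
    words = text.split()  # split() never yields empty strings, so w[0] is safe
    return [
        " ".join(group)
        for starts_upper, group in itertools.groupby(words, key=lambda w: w[0].isupper())
        if starts_upper
    ]


print(upper_runs_groupby("A Bbb c de F G A"))
```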
Upvotes: 7