Reputation: 811
I have a massive string of letters all jumbled up, 1.2k lines long. I'm trying to find a lowercase letter that has EXACTLY three capital letters on either side of it.
This is what I have so far:
    def scramble(sentence):
        try:
            for i, v in enumerate(sentence):
                if v.islower():
                    if sentence[i-4].islower() and sentence[i+4].islower():
                        ....
                        ....
        except IndexError:
            print()  # Trying to deal with the problem of reaching the end of the list
    # This section is checking if the fourth letters before and after i are
    # lowercase, to ensure the central lowercase letter has exactly three
    # uppercase letters around it
But now I am stuck on the next step. What I would like to do is create a for-loop over range(-3, 4) and check that each of these letters is uppercase. If there are in fact three uppercase letters on either side of the lowercase letter, then print them out.
For example:
    for j in range(-3, 4):
        if j != 0:
            # Some code to check if the letters in this range are uppercase
            # if j != 0 is there because we already know it is lowercase
            # because of the previous if v.islower(): statement.
If this doesn't make sense, this would be an example output if the code worked as expected:
    scramble("abcdEFGhIJKlmnop")
OUTPUT
    EFGhIJK
One lowercase letter with three uppercase letters either side of it.
Upvotes: 0
Views: 87
Reputation: 2082
Regex is probably the easiest. Using a modified version of @Israel Unterman's answer that accounts for the outside edges and non-uppercase surroundings, the full regex might be:
    import re
    s = 'abcdEFGhIJKlmnopABCdEFGGIddFFansTBDgRRQ'
    words = re.findall(r'(?:^|[^A-Z])([A-Z]{3}[a-z][A-Z]{3})(?:[^A-Z]|$)', s)
    # words is ['EFGhIJK', 'TBDgRRQ']
Using non-capturing groups, (?:...), keeps the check for the start of the string or a non-uppercase character out of the match groups, leaving only the desired tokens in the result list. This should account for all conditions listed by the OP.
(removed all my prior code as it was generally *bad*)
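One caveat with the pattern above: the boundary groups consume the character they check, so two hits separated by a single lowercase letter cannot both be found. A possible variant, offered as a sketch rather than a definitive fix, uses lookbehind and lookahead assertions, which test the neighbours without consuming them. The names PATTERN and find_guarded are made up for illustration.

```python
import re

# (?<![A-Z]) and (?![A-Z]) assert "no capital here" without consuming a
# character, so a boundary letter can be shared by two neighbouring matches.
PATTERN = re.compile(r'(?<![A-Z])[A-Z]{3}[a-z][A-Z]{3}(?![A-Z])')

def find_guarded(text):
    """Return each lowercase letter flanked by exactly three capitals per side."""
    return PATTERN.findall(text)

print(find_guarded('abcdEFGhIJKlmnopABCdEFGGIddFFansTBDgRRQ'))
# ['EFGhIJK', 'TBDgRRQ']
print(find_guarded('ABCdEFGhIJKlMNO'))
# ['ABCdEFG', 'IJKlMNO']
```

On the second string the consuming pattern returns only 'ABCdEFG', because the 'h' that ends the first match is no longer available to start the second.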
Upvotes: 1
Reputation: 1234
If you can't use regular expressions, maybe this for loop can do the trick:
    if v.islower():
        if sentence[i-4].islower() and sentence[i+4].islower():
            for k in range(1, 4):
                if sentence[i-k].islower() or sentence[i+k].islower():
                    break
                if k == 3:
                    return i
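For reference, here is one way the fragment could be fleshed out into a complete function, as a rough sketch under a few assumptions. It uses the range(-3, 4) loop from the question, collects every hit instead of returning the first, and treats positions outside the string as "not uppercase", since in Python sentence[i-4] silently wraps around to the end of the string when i < 4. Unlike the fragment, it only requires the fourth neighbours to be non-uppercase rather than lowercase. The names find_centres and is_upper_at are hypothetical.

```python
def find_centres(sentence):
    """Return indices of lowercase letters with exactly three capitals on each side."""
    def is_upper_at(pos):
        # Out-of-range positions count as "not uppercase"; this also avoids
        # negative indices wrapping around to the end of the string.
        return 0 <= pos < len(sentence) and sentence[pos].isupper()

    hits = []
    for i, v in enumerate(sentence):
        if not v.islower():
            continue
        # The three neighbours on each side must all be uppercase...
        if not all(is_upper_at(i + j) for j in range(-3, 4) if j != 0):
            continue
        # ...and the fourth neighbour on each side must not be.
        if is_upper_at(i - 4) or is_upper_at(i + 4):
            continue
        hits.append(i)
    return hits

s = 'abcdEFGhIJKlmnop'
print([s[i - 3:i + 4] for i in find_centres(s)])
# ['EFGhIJK']
```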
Upvotes: 1
Reputation: 13510
Here is a way to do it "Pythonically" without regular expressions:
    s = 'abcdEFGhIJKlmnop'
    # len(s) - 6 so that the last seven-letter window is also checked
    words = [s[i:i+7] for i in range(len(s) - 6)
             if s[i:i+3].isupper() and s[i+3].islower() and s[i+4:i+7].isupper()]
    print(words)
And the output is:
['EFGhIJK']
And here is a way to do it with regular expressions, which is, well, also Pythonic :-)
    import re
    words = re.findall(r'[A-Z]{3}[a-z][A-Z]{3}', s)
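Note that both snippets also accept runs longer than three capitals, for example the 'BCDeFGH' inside 'ABCDeFGHI'. If "exactly three" matters, one possible tightening of the slicing version, offered only as a sketch, also looks at the characters just outside the seven-letter window. The helper name exact_windows is hypothetical.

```python
def exact_windows(s):
    """Seven-letter windows UUUlUUU whose outer neighbours are not uppercase."""
    found = []
    for i in range(len(s) - 6):
        window = s[i:i + 7]
        shape_ok = (window[:3].isupper() and window[3].islower()
                    and window[4:].isupper())
        # At the edges of s these slices are empty strings, and ''.isupper()
        # is False, so the start and end need no special casing.
        left = s[max(i - 1, 0):i]
        right = s[i + 7:i + 8]
        if shape_ok and not left.isupper() and not right.isupper():
            found.append(window)
    return found

print(exact_windows('abcdEFGhIJKlmnop'))
# ['EFGhIJK']
print(exact_windows('ABCDeFGHI'))
# []
```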
Upvotes: 1