Reputation: 51
In this part of my code, I want to strip any non-alphabetic characters from the words I read from a file.
I understand that the error probably happens because an empty string is being tested,
but I can't figure out why, even after trying numerous different versions.
Here's what I have now:
for i in given_file:
    cut_it_out = True
    while cut_it_out:
        if len(i) == 0:
            cut_it_out = False
        else:
            while (len(i) != 0) and cut_it_out:
                if i.lower()[0].isalpha() and i.lower()[len(i) - 1].isalpha():
                    cut_it_out = False
                if (not i.lower()[len(i) - 1].isalpha()):
                    i = i[:len(i) - 2]
                if (not i.lower()[0].isalpha()):
                    i = i[1:]
Can anyone help me figure this out? Thanks.
Thanks for the interesting answers :). I want it to be even more precise, but there is an endless loop problem that I can't seem to get rid of.
Can anyone help me figure it out?
all_words = {}  # New empty dictionary
for i in given_file:
    if "--" in i:
        split_point = i.index("--")
        part_1 = i[:split_point]
        part_2 = i[split_point + 2:]
        combined_parts = [part_1, part_2]
        given_file.insert(given_file.index(i)+2, str(part_1))
        given_file.insert(given_file.index(part_1)+1, str(part_2))
        #given_file.extend(combined_parts)
        given_file.remove(i)
        continue
    elif len(i) > 0:
        if i.find('0') == -1 and i.find('1') == -1 and i.find('2') == -1 and i.find('3') == -1 and i.find('4') == -1\
                and i.find('5') == -1 and i.find('6') == -1 and i.find('7') == -1 and i.find('8') == -1 and i.find('9') == -1:
            while not i[:1].isalpha():
                i = i[1:]
            while not i[-1:].isalpha():
                i = i[:-1]
            if i.lower() not in all_words:
                all_words[i.lower()] = 1
            elif i.lower() in all_words:
                all_words[i.lower()] += 1
Upvotes: 1
Views: 258
Reputation: 82929
There are a few problems with your code:
- The second if statement can strip away the last character in a string of all non-alpha characters, and then the third if statement will produce an exception.
- You can use break instead of that boolean variable.
- If i.lower()[x] is non-alpha, so is i[x]; also, better use i[-1] for the last index.

After fixing those issues, but keeping the general idea the same, your code becomes:
while len(i) > 0:
    if i[0].isalpha() and i[-1].isalpha():
        break
    if not i[-1].isalpha():
        i = i[:-1]
    elif not i[0].isalpha():  # actually, just 'else' would be enough, too
        i = i[1:]
But that's still a bit hard to follow. I suggest using two loops for the two ends of the string:
while i and not i[:1].isalpha():
    i = i[1:]
while i and not i[-1:].isalpha():
    i = i[:-1]
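To illustrate why the "i and" guard matters, here is a minimal sketch that wraps the two loops in a function (the name strip_non_alpha is made up for this example, it is not from the question). Without the guard, a string with no letters at all shrinks to "" and then never changes, because "".isalpha() is False and ""[1:] is still "", so the loop would never end.

```python
def strip_non_alpha(word):
    """Trim non-alphabetic characters from both ends of word."""
    # The `word and` guard stops each loop as soon as the string is empty.
    while word and not word[:1].isalpha():
        word = word[1:]
    while word and not word[-1:].isalpha():
        word = word[:-1]
    return word


print(strip_non_alpha('"hello,'))     # hello
print(strip_non_alpha("don't!"))      # don't
print(repr(strip_non_alpha("...")))   # ''
```

Characters inside the word, such as the apostrophe in "don't", are left alone; only the two ends are trimmed.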
Or you could just use a regular expression, something like this:
import re

i = re.sub(r"^[^a-zA-Z]+|[^a-zA-Z]+$", "", i)
This reads: Replace all (+) characters that are not ([^...]) in the group a-zA-Z and that are directly after the start of the string (^) or (|) directly before the string's end ($) with "".
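A small runnable sketch of the regular-expression approach, in case it helps to see it on a few inputs. The names EDGE_NON_ALPHA and strip_edges are hypothetical, chosen only for this example. One difference from the isalpha() loops worth knowing: a-zA-Z covers ASCII letters only, while str.isalpha() also accepts letters such as "é".

```python
import re

# A run of non-letters at the start of the string, or a run at the end.
EDGE_NON_ALPHA = re.compile(r"^[^a-zA-Z]+|[^a-zA-Z]+$")


def strip_edges(word):
    """Remove leading and trailing characters that are not ASCII letters."""
    return EDGE_NON_ALPHA.sub("", word)


for raw in ["(word)", "--dash--", "it's", "1234", ""]:
    print(repr(raw), "->", repr(strip_edges(raw)))
```

Inputs with no letters at all, such as "1234" or the empty string, simply come back as "", so there is no special case to handle and no loop that could run forever.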
Upvotes: 1
Reputation: 3571
I think your problem is a consequence of an overcomplicated solution. The error was pointed out by @tobias_k. Your code can also be quite inefficient, because every slice such as i[1:] builds a new string. Try to simplify, for example (I have not tested this yet):
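for i in given_file:
    beg = 0
    end = len(i) - 1
    # Move beg forward past leading non-alphabetic characters.
    while beg <= end and not i[beg].isalpha():
        beg = beg + 1
    # Move end backward past trailing non-alphabetic characters.
    while beg <= end and not i[end].isalpha():
        end = end - 1
    res = ""
    if beg <= end:
        # The slice end is exclusive, so use end + 1 to keep i[end].
        res = i[beg:end + 1]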
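As a sketch of how the index-based idea might fit into the word count from the question's follow-up: note that a Python slice excludes its stop index, so the trimmed word is i[beg:end + 1]. The function names trim and count_words and the sample list are assumptions made for illustration. Splitting on "--" inside the loop, rather than inserting into given_file while iterating over it, and skipping words that end up empty are two ways to avoid the kind of endless loop described in the question.

```python
from collections import Counter


def trim(word):
    """Return word without leading/trailing non-alphabetic characters."""
    beg, end = 0, len(word) - 1
    while beg <= end and not word[beg].isalpha():
        beg += 1
    while beg <= end and not word[end].isalpha():
        end -= 1
    # Slices exclude the stop index, so end + 1 keeps the last letter.
    return word[beg:end + 1]


def count_words(tokens):
    """Count trimmed, lowercased words; skip tokens that contain digits."""
    counts = Counter()
    for token in tokens:
        # Split here instead of modifying the list being iterated over.
        for part in token.split("--"):
            if any(ch in "0123456789" for ch in part):
                continue
            word = trim(part)
            if word:  # nothing left means the part had no letters
                counts[word.lower()] += 1
    return counts


sample = ["Hello,", "world--hello", "...", "abc123", "(World)"]
print(count_words(sample))
```

With this sample list, "hello" and "world" are each counted twice, while "..." and "abc123" are ignored.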
Upvotes: 1