Reputation: 299
While preparing for a test, I am solving tests from previous years.
Write the function compress(lst)
that receives a non empty list of repetitive letters and returns a list of tuples, each tuple containing the letter and the number or subsequent repetitions. ( see example)
e.g.:
for:
['a','a', 'b', 'b', 'b', 'c', 'a', 'a']
the function should return:
[('a', 2), ('b', 3), ('c', 1), ('a', 2)]
Here's my code:
def compress(lst):
res = []
i = 0
for letter in lst:
letter_count = 0
while i < len(lst) and lst[i] == letter:
letter_count += 1
i +=1
res.append((letter, letter_count))
return res
My function returns:
[('a', 2), ('a', 0), ('b', 3), ('b', 0), ('b', 0), ('c', 1), ('a', 2), ('a', 0)]
I can see why it does that, but I cant see how to change my code to solve the problem.
Upvotes: 3
Views: 258
Reputation: 140246
groupby
is the pythonic way. If you want to emulate that, you can do it with O(1)
with only one loop, no indexes (not very pythonic) and a variable to store the previous letter:
def compress(lst):
res = []
previous_letter = None
count = 0
for letter in lst:
if previous_letter is None:
# first time
count += 1
previous_letter = letter
elif previous_letter == letter:
# same letter as before, count
count += 1
else:
# different letter: store result, reset counter
res.append((previous_letter,count))
count = 1
previous_letter = letter
# last iteration doesn't append the last result, fix this
res.append((previous_letter,count))
return res
result:
[('a', 2), ('b', 3), ('c', 1), ('a', 2)]
Upvotes: 1
Reputation: 114440
The issue is that the while
loop correctly counts occurrences, the for
loop marches on inexorably, one character at a time. Since you're already incrementing the index correctly in the while
loop, the simplest thing would be to get rid of either the for
or while
loop entirely. The only purpose to having multiple loops here is to attempt to avoid an if
, and as you see, that doesn't really work.
res = []
prev = ''
for letter in lst:
if prev == letter:
letter_count += 1
else:
if prev:
res.append((prev, letter_count))
letter_count = 1
prev = letter
This is similar to what you'd get from itertools.groupby
:
[(k, len(list(g))) for k, g in groupby(lst)]
Upvotes: 1
Reputation: 26039
Use groupby
from itertools
module which perfectly fits here:
from itertools import groupby
lst = ['a','a', 'b', 'b', 'b', 'c', 'a', 'a']
print([(k, len(list(v))) for k, v in groupby(lst)])
# [('a', 2), ('b', 3), ('c', 1), ('a', 2)]
Upvotes: 3
Reputation: 59274
(Assuming you want to correct your code)
Save the index
of the current analyzed letter and just sum to it how many times this letter repeats, so you skip already analyzed letters
def compress(lst):
res = []
i = 0
ind = 0
while ind < len(lst):
letter_count = 0
while i < len(lst) and lst[i] == lst[ind]:
letter_count += 1
i +=1
res.append((lst[ind], letter_count))
ind += letter_count
return res
>>> compress(['a','a', 'b', 'b', 'b', 'c', 'a', 'a'])
[('a', 2), ('b', 3), ('c', 1), ('a', 2)]
Upvotes: 2