Reputation: 29
Write a Python function that performs run-length encoding on a given string and returns the run-length encoded string.
I tried it using a loop but couldn't get the expected output.
def encode(message):
    #Remove pass and write your logic here
    count=0
    encoded_message=[]
    for char in range(0,len(message)-1,1):
        count=1
        while(message[char]==message[char+1]):
            count=count+1;
            char=char+1
        encoded_message.append(str(count)+message[char])
    return encoded_message
encoded_message=encode("ABBBBCCCCCCCCAB")
print(' '.join(encoded_message))
Expected output: 1A4B8C1A1B
What I got: 1A 4B 3B 2B 1B 8C 7C 6C 5C 4C 3C 2C 1C 1A
Upvotes: 1
Views: 2654
Reputation: 11228
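def func(string):
    # sentinel character so the final run is flushed inside the loop
    string +='@'
    dic = []
    tmp =[]
    tmp += [string[0]]
    for i in range(1,len(string)):
        if string[i]==string[i-1]:
            # same character as before: extend the current run
            tmp.append(string[i])
        else:
            # run ended: store it and start a new one
            dic.append(tmp)
            tmp=[]
            tmp.append(string[i])
    res = ''.join(['{}{}'.format(len(i),i[0]) for i in dic])
    return res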
string = 'ABBBBCCCCCCCCAB'
solution = func(string)
print(solution)
Output:
1A4B8C1A1B
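One caveat of the sentinel approach: if the input itself ends with '@', the last run merges with the sentinel and is never stored. A minimal sentinel-free sketch of the same idea follows; the name rle_no_sentinel is hypothetical and not part of the answer above.

```python
def rle_no_sentinel(text):
    # An empty input has no runs to encode.
    if not text:
        return ''
    parts = []
    current, count = text[0], 1
    for ch in text[1:]:
        if ch == current:
            count += 1
        else:
            # The run ended: record it and start counting the new character.
            parts.append(f"{count}{current}")
            current, count = ch, 1
    # Flush the final run, which the loop never records on its own.
    parts.append(f"{count}{current}")
    return ''.join(parts)


print(rle_no_sentinel("ABBBBCCCCCCCCAB"))  # 1A4B8C1A1B
print(rle_no_sentinel("A@"))               # 1A1@
```

Flushing the last run after the loop replaces the sentinel, so no character has to be reserved.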
Upvotes: 0
Reputation: 6920
You can use groupby from the itertools module:
s = "ABBBBCCCCCCCCAB"
from itertools import groupby
expected = ''.join([str(len(list(v)))+k for k,v in groupby(s)])
Output :
'1A4B8C1A1B'
groupby(s) returns an itertools.groupby object. A list comprehension over this object, such as [(k,list(v)) for k,v in groupby(s)], gives the groups in order:
[('A', ['A']), ('B', ['B', 'B', 'B', 'B']), ('C', ['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C']), ('A', ['A']), ('B', ['B'])]
We can count the items in the second element of each tuple, put that count (as a string) in front of the first element, and join the results.
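The counting-and-joining step described above can be wrapped in a function. This is only a sketch: the name rle_groupby is hypothetical, and it counts each group with sum() instead of building a list, which is a small variation on the one-liner shown earlier.

```python
from itertools import groupby


def rle_groupby(text):
    # groupby yields (key, group_iterator) for each run of equal characters;
    # sum(1 for _ in run) counts the run without materialising a list.
    return ''.join(f"{sum(1 for _ in run)}{char}" for char, run in groupby(text))


print(rle_groupby("ABBBBCCCCCCCCAB"))  # 1A4B8C1A1B
```

An empty string produces no groups, so the function returns an empty string in that case.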
Update :
You are trying to change the iteration index inside the loop by doing char=char+1, but that does not change the iteration index: the for loop takes the next value from range at the start of every iteration, so no iterations are skipped. Add these two print lines to your code and you will see that the char variable you increase inside the loop is not the iteration index itself :
...
for char in range(0,len(message)-1,1):
    print('\tchar at first line : ', char, 'char id now : ', id(char))
    count=1
    while(message[char]==message[char+1]):
        count=count+1
        char=char+1
        print('char now : ', char, 'char id now : ', id(char))
...
It should output something like :
char at first line : 1 char id now : 11197408
char now : 2 char id now : 11197440
char now : 3 char id now : 11197472
char now : 4 char id now : 11197504
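Notice how the id of char changes each time it is reassigned: char=char+1 binds the name to a new integer object, and the for loop rebinds it again at the start of the next iteration.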
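A stripped-down illustration of the same effect, independent of the encoding problem (the variable names here are made up for the example): reassigning the loop variable inside a for loop has no effect on which values the loop visits, whereas a while loop leaves the index under your control.

```python
# for loop: the next value always comes from the range iterator
visited = []
for index in range(5):
    visited.append(index)
    index += 10  # rebinds the name only; the iterator is unaffected

print(visited)  # [0, 1, 2, 3, 4]

# while loop: the index is an ordinary variable, so skipping works
skipped = []
index = 0
while index < 5:
    skipped.append(index)
    index += 2

print(skipped)  # [0, 2, 4]
```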
Upvotes: 5
Reputation: 7812
If you want to fix your function, here is a fixed variant:
def encode(message):
    result = []
    i = count = 0
    while i < len(message) - 1:
        count = 1
        # extend the current run while the next character matches
        while i + count < len(message) and message[i + count - 1] == message[i + count]:
            count += 1
        i += count
        result.append("{}{}".format(count, message[i - 1]))
    # the outer loop stops one short, so a trailing single character may still be pending
    if i == len(message) - 1:
        result.append("1" + message[-1])
    return result
What's changed:
range(0,len(message)-1,1) yields 0, 1, 2, ... and it doesn't matter what you do with the char variable inside the loop; it won't affect the next iteration. To be able to skip indexes, I used a while loop with index and count variables initialised beforehand ( i = count = 0 ).
message[i + count - 1] == message[i + count] - checks whether the next symbol is the same as the current one.
i + count < len(message) - prevents the internal loop from accessing an index out of range.
i += count - the index is advanced outside of the internal loop.
if i == len(message) - 1: - post condition added after the loop so that the last character is not missed when it is a single one.
Upvotes: 0
Reputation: 579
Use this logic; it gives you a dictionary with the frequency of each letter.
s = "ABBBBCCCCCCCCAB"
d = {i:0 for i in s}
for i in s:
    d[i] += 1
print(d)
Output:
{'A': 2, 'B': 5, 'C': 8}
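Note that a frequency dictionary counts every occurrence of a letter regardless of position, so it differs from run-length encoding whenever a letter appears in more than one run (here 'A' and 'B'). A small sketch contrasting the two, using collections.Counter for the totals and itertools.groupby for the runs; the variable names are illustrative only.

```python
from collections import Counter
from itertools import groupby

s = "ABBBBCCCCCCCCAB"

# Total occurrences per letter, wherever they appear in the string.
totals = Counter(s)

# Length of each consecutive run, in order of appearance.
runs = [(char, len(list(group))) for char, group in groupby(s)]

print(dict(totals))  # {'A': 2, 'B': 5, 'C': 8}
print(runs)          # [('A', 1), ('B', 4), ('C', 8), ('A', 1), ('B', 1)]
```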
Upvotes: 0
Reputation: 195438
You can also use the re module to encode the string:
s = 'ABBBBCCCCCCCCAB'
import re
l = ''.join(str(len(c2)+1) + c1 for c1, c2 in re.findall(r'([A-Z])(\1*)', s))
print(l)
Prints:
1A4B8C1A1B
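The pattern above only matches uppercase letters A-Z. If other characters can occur, one possible generalisation is sketched below; the function name rle_regex is hypothetical, and the pattern (.)\1* with re.DOTALL is an assumption about what inputs need to be handled.

```python
import re


def rle_regex(text):
    # (.) captures any single character; \1* extends the match over its repeats.
    # re.DOTALL lets '.' match newline characters as well.
    return ''.join(
        f"{len(m.group(0))}{m.group(1)}"
        for m in re.finditer(r'(.)\1*', text, flags=re.DOTALL)
    )


print(rle_regex("ABBBBCCCCCCCCAB"))  # 1A4B8C1A1B
print(rle_regex("aa  !!"))           # 2a2 2!
```

Using the length of the whole match avoids the +1 needed when only the repeated part is measured.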
Upvotes: 0