Reputation: 27
In my program, a single character always gets a count of 1. For example, for abbbbbc I get back a1b5c1. I don't want single characters to have a count; I'd like the output to be ab5c. My program is below:
def rle(character_string):
    compressed_string = ""
    count = 1
    for i in range(len(character_string) - 1):
        if character_string[i] == character_string[i + 1]:
            count += 1
        else:
            compressed_string += character_string[i] + str(count)
            count = 1
    compressed_string += character_string[-1] + str(count)
    if len(compressed_string) >= len(character_string):
        return character_string
    return compressed_string
user_string = input("hello user spam character: ")
x = rle(user_string)
print(x)
Upvotes: 1
Views: 132
Reputation: 27404
You only need to append the count if it's not 1. Therefore:
def rle(s: str) -> str:
    p, *r = s
    count = 1
    result = p
    for c in r:
        if c == p:
            count += 1
        else:
            if count > 1:
                result = f"{result}{count}{c}"
                count = 1
            else:
                result += c
            p = c
    return result if count < 2 else f"{result}{count}"
print(rle("abbbbbc"))
print(rle("abbbbbcc"))
Output:
ab5c
ab5c2
Note:
It is assumed that the string to be encoded does not contain any digits. If it did, this encoding technique would generate a string that could not be decoded due to obvious ambiguities.
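To make the ambiguity concrete, here is a minimal sketch. It uses a compact stand-in encoder built on itertools.groupby (my own helper for illustration, not the function above, though it follows the same "omit a count of 1" rule). Two different inputs produce the same encoded string, so no decoder can recover both.

```python
from itertools import groupby


def rle(text):
    """Stand-in encoder: character followed by run length, count omitted when 1."""
    parts = []
    for ch, run in groupby(text):
        n = len(list(run))
        parts.append(ch if n == 1 else f"{ch}{n}")
    return "".join(parts)


twelve_a = rle("a" * 12)   # a run of twelve 'a' characters
literal = rle("a12")       # the three literal characters 'a', '1', '2'

print(twelve_a)  # a12
print(literal)   # a12
```

Both calls print a12, which is why the input has to be digit-free (or the format needs an escape mechanism) for decoding to be well defined.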
Addendum:
Here's a decoder:
def rld(s: str) -> str:
    result, *r = s
    count = 0
    for c in r:
        if c.isdecimal():
            count = count * 10 + int(c)
        else:
            if count > 0:
                result += (result[-1] * (count-1))
                count = 0
            result += c
    return result if count == 0 else result + (result[-1] * (count-1))
Upvotes: 0
Reputation: 358
This code works for your run-length encoding problem. The algorithm you describe needs to skip the count when a character occurs only once. You can achieve this with an if/else statement that appends the count only when it is greater than one.
def rel(string):
    count, compressed_str = 1, ''
    for index in range(len(string) - 1):
        if string[index] == string[index + 1]:
            count += 1
        else:
            # Append the count only if the run is longer than 1
            if count > 1:
                compressed_str += string[index] + str(count)
            else:
                compressed_str += string[index]
            count = 1
    # Handle the last run
    if count > 1:
        compressed_str += string[-1] + str(count)
    else:
        compressed_str += string[-1]
    return compressed_str
print(rel('aabbbbbcccdccc'))
Upvotes: 0
Reputation: 169398
To show off the standard library, this could be done with itertools.groupby:
import itertools
def rle_parts(text):
    for ch, grouper in itertools.groupby(text):
        count = sum(1 for _ in grouper)
        yield f"{ch}{count}" if count > 1 else ch
def rle(text):
    return "".join(rle_parts(text))
print(rle("abbbbbcc"))
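If groupby is unfamiliar: it yields one (key, group iterator) pair per run of consecutive equal items, which is exactly the unit run-length encoding works on. A small sketch of what it produces for the sample input (the variable name runs is just for illustration):

```python
from itertools import groupby

# Each pair is (character, iterator over that run); len(list(...)) counts the run.
runs = [(ch, len(list(group))) for ch, group in groupby("abbbbbcc")]
print(runs)  # [('a', 1), ('b', 5), ('c', 2)]
```

Note that groupby only merges adjacent equal items, so a character that reappears later in the string starts a new run.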
Upvotes: 0
Reputation: 179717
For a rather more succinct way to implement your RLE algorithm using regular expressions, with a bonus decompressor:
import re
def rle(x):
    return re.sub(r"([a-z])\1+", lambda m: f"{m[1]}{len(m[0])}", x)
def unrle(x):
    return re.sub(r"([a-z])(\d+)", lambda m: m[1] * int(m[2]), x)
x = rle('aabbbzbbccpppppxxypppppkk')
print(x)
print(unrle(x))
The way this works is to look for a single letter followed by one or more repetitions of that letter. The replacement function then replaces that run with the RLE representation. The reverse is easy: we look for letters followed by any number of digits, and replace that with the correct run of letters. This even handles arbitrarily long numbers (e.g. a123 decompresses to 123 a's).
The code above assumes you’re only compressing strings of lowercase letters; if not, adjust the character class [a-z] as needed.
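As one possible adjustment, here is a sketch that swaps [a-z] for \D (any non-digit character), so uppercase letters, spaces and punctuation are handled too. The names rle_any and unrle_any are made up for this example, and the input is still assumed to contain no digits.

```python
import re


def rle_any(text):
    # (\D) captures any non-digit; \1+ requires at least one more copy of it.
    return re.sub(r"(\D)\1+", lambda m: f"{m[1]}{len(m[0])}", text)


def unrle_any(text):
    # A non-digit followed by digits expands to that many copies of the character.
    return re.sub(r"(\D)(\d+)", lambda m: m[1] * int(m[2]), text)


original = "AAA  bb!!"
encoded = rle_any(original)
print(encoded)             # A3 2b2!2
print(unrle_any(encoded))  # AAA  bb!!
```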
Upvotes: 2
Reputation: 2022
You need to append the count only when it's greater than 1.
def rle(character_string):
    compressed_string = ""
    count = 1
    for i in range(len(character_string) - 1):
        if character_string[i] == character_string[i + 1]:
            count += 1
        else:
            compressed_string += character_string[i]
            # Append the count only if it's greater than 1
            if count > 1:
                compressed_string += str(count)
            count = 1
    compressed_string += character_string[-1]
    # Same thing for the last character
    if count > 1:
        compressed_string += str(count)
    if len(compressed_string) >= len(character_string):
        return character_string
    return compressed_string
user_string = input("hello user spam character: ")
x = rle(user_string)
print(x)
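One edge case worth mentioning: if the user just presses Enter, character_string[-1] raises an IndexError because the string is empty. Below is a sketch of the same logic with an early return for that case. The name rle_safe is hypothetical, and the loop is written with zip instead of indices purely as a matter of taste.

```python
def rle_safe(character_string):
    # Guard: indexing [-1] on an empty string would raise IndexError.
    if not character_string:
        return character_string
    pieces = []
    count = 1
    # Walk over adjacent pairs of characters.
    for prev, cur in zip(character_string, character_string[1:]):
        if prev == cur:
            count += 1
        else:
            # Run ended: emit the character, plus the count only if above 1.
            pieces.append(prev + (str(count) if count > 1 else ""))
            count = 1
    # Emit the final run.
    pieces.append(character_string[-1] + (str(count) if count > 1 else ""))
    compressed_string = "".join(pieces)
    # Same length check as above: keep the original unless it got shorter.
    if len(compressed_string) >= len(character_string):
        return character_string
    return compressed_string


print(rle_safe(""))         # prints an empty line
print(rle_safe("abbbbbc"))  # ab5c
```

Because the length check is kept, an input such as aabb comes back unchanged (a2b2 is not shorter than aabb).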
Upvotes: 4