Bedbug exterminator

Reputation: 27

How to get rid of the count if there is only a single character?

In my program, every character gets a count, even when it occurs only once. For example, for abbbbbc I get back a1b5c1. I don't want single characters to have a count; I would like the output to read ab5c. My program is below:

def rle(character_string):
  compressed_string = ""
  count = 1

  for i in range(len(character_string) - 1):
    if character_string[i] == character_string[i + 1]:
      count += 1
    else:
      compressed_string += character_string[i] + str(count)
      count = 1

  compressed_string += character_string[-1] + str(count)

  if len(compressed_string) >= len(character_string):
    return character_string

  return compressed_string


user_string = input("hello user spam character: ")
x = rle(user_string)

print(x)

Upvotes: 1

Views: 132

Answers (5)

Adon Bilivit

Reputation: 27404

You only need to append the count if it's not 1. Therefore:

def rle(s: str) -> str:
    if not s:  # an empty string has no first character to unpack
        return s
    p, *r = s
    count = 1
    result = p
    for c in r:
        if c == p:
            count += 1
        else:
            if count > 1:
                result = f"{result}{count}{c}"
                count = 1
            else:
                result += c
            p = c
    return result if count < 2 else f"{result}{count}"

print(rle("abbbbbc"))
print(rle("abbbbbcc"))

Output:

ab5c
ab5c2

Note:

It is assumed that the string to be encoded does not contain any digits. If it did, the encoded string could not be decoded reliably, because a digit from the input would be indistinguishable from a run count.
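To make that concrete, here is a minimal sketch. The `encode` helper is a hypothetical stand-in that follows the same scheme (count omitted when it is 1); it is not the function above, just a short equivalent written with `itertools.groupby` so the example is self-contained.

```python
from itertools import groupby


def encode(s: str) -> str:
    # Hypothetical stand-in: same scheme as above, count omitted when it is 1.
    parts = []
    for ch, run in groupby(s):
        n = len(list(run))
        parts.append(ch if n == 1 else f"{ch}{n}")
    return "".join(parts)


# Two different inputs produce the same encoded text,
# so no decoder could tell them apart.
with_digits = encode("a" + "1" * 5)  # 'a' followed by five '1' characters
letters_only = encode("a" * 15)      # fifteen 'a' characters
print(with_digits, letters_only)
```

Both calls return the same string, which is why the input has to be digit-free for this format to round-trip.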

Addendum:

Here's a decoder:

def rld(s: str) -> str:
    if not s:  # nothing to decode, and unpacking below needs one character
        return s
    result, *r = s
    count = 0
    for c in r:
        if c.isdecimal():
            count = count * 10 + int(c)
        else:
            if count > 0:
                result += (result[-1] * (count-1))
                count = 0
            result += c
    return result if count == 0 else result + (result[-1] * (count-1))

Upvotes: 0

Nalan PandiKumar

Reputation: 358

This code handles your run-length encoding problem. The algorithm you describe needs to omit the occurrence count when a character appears only once. You can achieve this with an if/else statement that appends the count only when it is greater than one.


def rel(string):
    count, compressed_str = 1, ''

    for index in range(len(string) - 1):
        if string[index] == string[index + 1]:
            count += 1
        else:
            # Append the count only when the run is longer than one character
            if count > 1:
                compressed_str += string[index] + str(count)
            else:
                compressed_str += string[index]
            count = 1
            
    # Handling the last character
    if count > 1:
        compressed_str += string[-1] + str(count)
    else:
        compressed_str += string[-1]

    return compressed_str

print(rel('aabbbbbcccdccc'))  

Upvotes: 0

AKX

Reputation: 169398

To show off the standard library, this could be done with itertools.groupby:

import itertools


def rle_parts(text):
    for ch, grouper in itertools.groupby(text):
        count = sum(1 for _ in grouper)
        yield f"{ch}{count}" if count > 1 else ch


def rle(text):
    return "".join(rle_parts(text))


print(rle("abbbbbcc"))
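As a small sketch of what `itertools.groupby` hands to the loop above: it yields one `(key, group)` pair per run of consecutive equal items, so counting each group gives the run length. The variable name `runs` is only illustrative.

```python
import itertools

# groupby yields one (key, group) pair per run of consecutive equal items.
runs = [(ch, len(list(group))) for ch, group in itertools.groupby("abbbbbcc")]
print(runs)  # [('a', 1), ('b', 5), ('c', 2)]
```

Note that it groups only adjacent items, which is the behaviour run-length encoding needs.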

Upvotes: 0

nneonneo

Reputation: 179717

For a rather more succinct way to implement your RLE algorithm using regular expressions, with a bonus decompressor:

import re

def rle(x):
    return re.sub(r"([a-z])\1+", lambda m: f"{m[1]}{len(m[0])}", x)

def unrle(x):
    return re.sub(r"([a-z])(\d+)", lambda m: m[1] * int(m[2]), x)

x = rle('aabbbzbbccpppppxxypppppkk')
print(x)
print(unrle(x))

The way this works is to look for a single letter followed by one or more repetitions of that letter. The replacement function then replaces that run with the RLE representation. The reverse is easy: we look for a letter followed by one or more digits, and replace that with the corresponding run of letters. This even handles arbitrarily long numbers (e.g. a123 decompresses to a run of 123 a's).

The code above assumes you’re only compressing strings of lowercase letters; if not, adjust the character class [a-z] as needed.
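One possible adjustment, sketched here as an assumption about what you might need rather than as part of the answer above, is to accept any non-digit character by using `\D` in place of `[a-z]`. The names `rle_any` and `unrle_any` are hypothetical.

```python
import re


def rle_any(x):
    # (\D) captures any single non-digit character; \1+ matches its repeats.
    return re.sub(r"(\D)\1+", lambda m: f"{m[1]}{len(m[0])}", x)


def unrle_any(x):
    # A non-digit followed by one or more digits expands back into a run.
    return re.sub(r"(\D)(\d+)", lambda m: m[1] * int(m[2]), x)


# Digits in the input are still not supported: they would be read as counts.
text = "AAAb   !!c"
packed = rle_any(text)
print(packed)
print(unrle_any(packed) == text)
```

This keeps the same two-regex structure and only widens what counts as a compressible character.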

Upvotes: 2

elmiomar

Reputation: 2022

You need to append the count only when it's greater than 1.

def rle(character_string):
    compressed_string = ""
    count = 1

    for i in range(len(character_string) - 1):
        if character_string[i] == character_string[i + 1]:
            count += 1
        else:
            compressed_string += character_string[i]
            # Append the count only if it's greater than 1
            if count > 1:
                compressed_string += str(count)
            count = 1

    compressed_string += character_string[-1]
    # Same thing for the last character
    if count > 1:
        compressed_string += str(count)

    if len(compressed_string) >= len(character_string):
        return character_string

    return compressed_string

user_string = input("hello user spam character: ")
x = rle(user_string)

print(x)

Upvotes: 4
