jazzer97

Reputation: 157

String slicing issue in Python

I'm writing a simple program to decode a binary string given by:

bin_str = "101100001101100001"

If the first character is "1", the next eight characters are decoded. Here those are "01100001", which I pass into the function below to obtain its ASCII representation.

def convert_ascii(binary):
    c = chr(int(binary, 2))
    return c

Passing "01100001" into the above function yields "a", which is the first decoded character. Moving on, the next character, at index 9, is also "1", so the following eight characters, again "01100001", are decoded as well. Passing them into the function also yields "a".

lst = []
fixed_length = 8
i = 0
while i < len(bin_str):
    if binary[i] == "1":
        fl_bin = binary[i+1:fixed_length+1] #issue here
        ascii_rep = convert_ascii(fl_bin)
        lst.append(ascii_rep)
        i+=fixed_length+1

The problem I'm facing is slicing the eight-character string "01100001" out of the original bin_str. I tried slicing with [i+1:fixed_length+1], but on the second pass fl_bin became "" instead of the next "01100001".

Would appreciate some help on this.

Upvotes: 0

Views: 56

Answers (2)

vash_the_stampede

Reputation: 4606

Use iter and next to walk through the string. Whenever next produces a "1", collect the next eight characters into a sublist, join them, and append the result to the main list. Repeat until the iterator is exhausted.

bin_str = "101100001101100001" 
a = iter(bin_str) 
lst = []

while True:
    try:
        b = next(a)
        z = []
        if b == '1':
            for i in range(8):
                z.append(next(a))
            lst.append(''.join(z))
    except StopIteration:
        break

print(lst)
# ['01100001', '01100001']
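The slice from the question can also be fixed directly. The end index has to be relative to i, so it should be i + 1 + fixed_length rather than the constant fixed_length + 1. On the second pass the original slice is [10:9], which is why it comes back empty. The following is only a minimal sketch: decode_fixed is a hypothetical helper name, and skipping any character that is not a "1" marker, as well as ignoring a truncated trailing group, are assumptions that the question does not spell out.

```python
def decode_fixed(bin_str, fixed_length=8):
    """Decode groups of a '1' marker followed by fixed_length bits.

    Hypothetical helper for illustration, not code from the question.
    """
    chars = []
    i = 0
    while i < len(bin_str):
        if bin_str[i] == "1":
            start = i + 1                # first bit after the marker
            end = start + fixed_length   # end index is relative to i
            chunk = bin_str[start:end]
            if len(chunk) < fixed_length:
                break                    # assumption: ignore a truncated group
            chars.append(chr(int(chunk, 2)))
            i = end                      # jump to the next marker position
        else:
            i += 1                       # assumption: skip non-marker characters
    return "".join(chars)


print(decode_fixed("101100001101100001"))  # aa
```

Advancing i in the else branch also keeps the loop from spinning forever when the current character is not "1".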

Upvotes: 0

kindall

Reputation: 184200

A nice way to do this is to create a regular expression that matches 1 followed by exactly eight 1 or 0 characters, and then use re.findall() to find all non-overlapping occurrences of this pattern in the string. By using a non-capturing group, you can even keep the initial 1 digit from being included in the results (although if you didn't do this, it's trivial to slice off that digit).

import re
reg_ex = "(?:1)([01]{8})"

bin_str = "101100001101100001"
ascii_rep = "".join(chr(int(byte, 2)) for byte in re.findall(reg_ex, bin_str))

As a bonus, this allows the groups in the source string to be separated (by spaces, words, or anything that is not a 1 followed by eight 0s or 1s) for easier reading.
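To illustrate the separator point, here is a small sketch of the same idea wrapped in a function. The name decode_marked and the sample inputs are made up for this example, and the pattern is written as a raw string with a plain 1 in place of the non-capturing group, which should match the same text.

```python
import re

# A '1' marker followed by exactly eight binary digits; only the
# eight digits are captured, so findall() returns them without the marker.
PATTERN = re.compile(r"1([01]{8})")


def decode_marked(text):
    """Hypothetical helper: decode every marker-prefixed byte found in text."""
    return "".join(chr(int(byte, 2)) for byte in PATTERN.findall(text))


print(decode_marked("101100001101100001"))      # aa
print(decode_marked("101100001 101100010"))     # ab
print(decode_marked("101100001, 101100010"))    # ab
```

One caveat with this approach: a separator that itself contains a 1 followed by eight binary digits would be picked up as a group, so separators are safest when they contain no 0 or 1 characters at all.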

Upvotes: 1
