Niko
Niko

Reputation: 4448

Python3 write string as binary

For a Python 3 programming assignment I have to work with Huffman coding. It's simple enough to generate the correct codes which result in a long string of 0's and 1's.

Now my problem is actually writings this string of as binary and not as text. I attempted to do this:

result = "01010101 ... " #really long string of 0's and 1's
filewrt = open(output_file, "wb") #appending b to w should write as binary, should it not?
filewrt.write(result)
filewrt.close()

however I'm still geting a large text file of 0 and 1 characters. How do I fix this?

EDIT: It seems as if I just simply don't understand how to represent an arbitrary bit in Python 3.

Based on this SO question I devised this ugly monstrosity:

for char in result: 
    filewrt.write( bytes(int(char, 2)) )

Instead of getting anywhere close to working, it outputted a zero'd file that was twice as large as my input file. Can someone please explain to me how to represent binary arbitrarily? And in the context of creating a huffman tree, how do I go about concatinating or joining bits based on their leaf locations if I should not use a string to do so.

Upvotes: 1

Views: 1898

Answers (1)

saeedgnu
saeedgnu

Reputation: 4366

def intToTextBytes(n, stLen=0):
    bs = b''
    while n>0:
        bs = bytes([n & 0xff]) + bs
        n >>= 8
    return bs.rjust(stLen, b'\x00')


num = 0b01010101111111111111110000000000000011111111111111
bs = intToTextBytes(num)
print(bs)
open(output_file, "wb").write(bs)

EDIT: A more complicated, but faster (about 3 times) way:

from math import log, ceil
intToTextBytes = lambda n, stLen=0: bytes([
    (n >> (i<<3)) & 0xff for i in range(int(ceil(log(n, 256)))-1, -1, -1)
]).rjust(stLen, b'\x00')

Upvotes: 1

Related Questions