mateo

Reputation: 37

Why are the sizes of these binary files equal when they should not be?

While writing a simple Python script, I encountered a weird problem: two files with different content have the same size.

I have the same binary data in two forms, one as a string and one as a list of ints:

char_list = '10101010'
int_list = [1, 0, 1, 0, 1, 0, 1, 0]

Then I convert both to a bytearray:

bytes_from_chars = bytearray(char_list, "ascii")
bytes_from_ints = bytearray(int_list)

Printing them gives this result:

bytearray(b'10101010')
bytearray(b'\x01\x00\x01\x00\x01\x00\x01\x00')

So far, this is what I expect.

Writing this data to disk:

with open("from_chars.hex", "wb") as f:
    f.write(bytes_from_chars)

with open("from_ints.hex", "wb") as f:
    f.write(bytes_from_ints)

The two files have the same size, but they contain different data!

ls -l:

[image: size of files]

hexdump of files:

[image: hexdump]
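
The same check can be reproduced from Python. This is only a minimal sketch: it assumes a temporary directory in place of the working directory, and the names size_chars, size_ints, hex_chars and hex_ints are made up for illustration.

```python
import os
import tempfile

char_list = '10101010'
int_list = [1, 0, 1, 0, 1, 0, 1, 0]

bytes_from_chars = bytearray(char_list, "ascii")
bytes_from_ints = bytearray(int_list)

# Assumption: a throwaway temp directory stands in for the working directory.
with tempfile.TemporaryDirectory() as tmp:
    chars_path = os.path.join(tmp, "from_chars.hex")
    ints_path = os.path.join(tmp, "from_ints.hex")

    with open(chars_path, "wb") as f:
        f.write(bytes_from_chars)
    with open(ints_path, "wb") as f:
        f.write(bytes_from_ints)

    # File sizes in bytes, the same number ls -l reports.
    size_chars = os.path.getsize(chars_path)
    size_ints = os.path.getsize(ints_path)

    # File contents as hex digits, two digits per byte.
    with open(chars_path, "rb") as f:
        hex_chars = f.read().hex()
    with open(ints_path, "rb") as f:
        hex_ints = f.read().hex()

print(size_chars, hex_chars)  # 8 3130313031303130
print(size_ints, hex_ints)    # 8 0100010001000100
```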

My question is: why are the file sizes equal? As far as I know, to write a value of 0 or 1 we need 1 bit, and to write a hex value of 30 or 31 we need 5 bits (1 1110 and 1 1111).

Upvotes: 1

Views: 312

Answers (1)

Dinari

Reputation: 2557

Writing the value 0 or 1 to a file does not take a single bit; each value takes a whole byte. Otherwise, how could you tell the difference between the value 3 (binary 11) and two separate 1s?

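In both cases you are writing an array of 8 bytes. In the first case each byte holds the ASCII code of a character (hex 30 for '0', hex 31 for '1'); in the second case each byte holds the integer 0 or 1.
Think of it as writing words from the letters 0 and 1, where every word is 8 letters long: the word for 1 is 0000 0001. Without the leading 0s, you would not be able to tell where one word ends and the next begins.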
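
To see this from Python, here is a small sketch using the data from the question. The names chars_bits, ints_bits and packed are made up for illustration, and the last step (packing the eight values into one byte) is only a hypothetical alternative, not something the question's code does.

```python
char_list = '10101010'
int_list = [1, 0, 1, 0, 1, 0, 1, 0]

bytes_from_chars = bytearray(char_list, "ascii")
bytes_from_ints = bytearray(int_list)

# Both arrays have 8 elements, and every element is stored as one full byte.
print(len(bytes_from_chars), len(bytes_from_ints))  # 8 8

# Each byte written out as 8 binary digits, leading zeros included.
chars_bits = [format(b, "08b") for b in bytes_from_chars]
ints_bits = [format(b, "08b") for b in bytes_from_ints]
print(chars_bits[0])  # 00110001  (ASCII '1', hex 31)
print(ints_bits[0])   # 00000001  (integer 1)

# Hypothetical alternative: treat the string as one binary number
# and store all eight 0/1 values in a single byte.
packed = int(char_list, 2).to_bytes(1, "big")
print(len(packed), packed.hex())  # 1 aa
```

Only the packed form actually spends one bit per value, so a file written from it would be 1 byte instead of 8.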

Upvotes: 1
