Reputation: 37
While writing a simple Python script, I encountered a weird problem: two files with different content have the same size.
I have two equivalent lists of binary data, one as a string and one as a list of ints:
char_list = '10101010'
int_list = [1, 0, 1, 0, 1, 0, 1, 0]
Then I convert both to a bytearray:
bytes_from_chars = bytearray(char_list, "ascii")
bytes_from_ints = bytearray(int_list)
Printing them gives me this result:
bytearray(b'10101010')
bytearray(b'\x01\x00\x01\x00\x01\x00\x01\x00')
So far, this is OK.
Writing this data to disk:
with open("from_chars.hex", "wb") as f:
    f.write(bytes_from_chars)
with open("from_ints.hex", "wb") as f:
    f.write(bytes_from_ints)
The two files have the same size, but they contain different data!
ls -l:
hexdump of files:
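The sizes and raw contents can also be checked from Python itself. This is only a minimal sketch: it assumes Python 3.8+ (for the separator argument of `bytes.hex`) and writes to a temporary directory instead of the working directory.

```python
import os
import tempfile

char_list = '10101010'
int_list = [1, 0, 1, 0, 1, 0, 1, 0]

bytes_from_chars = bytearray(char_list, "ascii")
bytes_from_ints = bytearray(int_list)

# Assumption of this sketch: use a temporary directory, not the current one.
with tempfile.TemporaryDirectory() as tmp:
    chars_path = os.path.join(tmp, "from_chars.hex")
    ints_path = os.path.join(tmp, "from_ints.hex")
    with open(chars_path, "wb") as f:
        f.write(bytes_from_chars)
    with open(ints_path, "wb") as f:
        f.write(bytes_from_ints)

    # File sizes in bytes, roughly what `ls -l` reports.
    size_chars = os.path.getsize(chars_path)
    size_ints = os.path.getsize(ints_path)

    # Raw contents as hex, roughly what a hexdump shows.
    with open(chars_path, "rb") as f:
        hex_chars = f.read().hex(" ")
    with open(ints_path, "rb") as f:
        hex_ints = f.read().hex(" ")

print(size_chars, hex_chars)  # 8 31 30 31 30 31 30 31 30
print(size_ints, hex_ints)    # 8 01 00 01 00 01 00 01 00
```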
My question is: why are the file sizes equal? As far as I know, to write a value of 0 or 1 we need 1 bit, and to write a hex value of 30 or 31 we need 5 bits (1 1110 and 1 1111).
Upvotes: 1
Views: 312
Reputation: 2557
A single bit is not enough to write the value 0 or 1 to a file. If each value took only as many bits as it needs, how could you tell the difference between 3 (binary 11) and two separate 1s?
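In both cases you are writing an array of 8 bytes. The only difference is that in the first case each byte holds the ASCII code of a character ('0' is 0x30, '1' is 0x31), while in the second case each byte holds the integer value 0 or 1.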
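To make this concrete, here is a small sketch that prints the length of each bytearray and the full 8-bit pattern of its first two elements. The variable names `lengths`, `bits_from_chars` and `bits_from_ints` are just illustrative.

```python
bytes_from_chars = bytearray('10101010', "ascii")
bytes_from_ints = bytearray([1, 0, 1, 0, 1, 0, 1, 0])

# Both arrays hold 8 elements, and each element occupies exactly one byte.
lengths = (len(bytes_from_chars), len(bytes_from_ints))

# Render every byte as its full 8-bit pattern.
bits_from_chars = [format(b, "08b") for b in bytes_from_chars]
bits_from_ints = [format(b, "08b") for b in bytes_from_ints]

print(lengths)              # (8, 8)
print(bits_from_chars[:2])  # ['00110001', '00110000']
print(bits_from_ints[:2])   # ['00000001', '00000000']
```

Note that the character '1' is stored as 0011 0001 (0x31), which still takes one whole byte on disk, just like the integer 1.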
Think of it as writing a word from the letters 0 and 1: the word 1 is 0000 0001. Without the 0s at the start, you won't be able to tell what the word is.
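If the goal really is to store the eight values in eight bits, they have to be packed into one byte explicitly. One possible way, shown only as a sketch, is to parse the string as a binary number and convert it with `int.to_bytes`:

```python
char_list = '10101010'

# Interpret the string as one binary number.
value = int(char_list, 2)

# Store that number in a single byte (byte order is irrelevant for 1 byte).
packed = value.to_bytes(1, "big")

print(value)        # 170
print(packed)       # b'\xaa'
print(len(packed))  # 1
```

Writing `packed` to a file opened with "wb" would then give a 1-byte file instead of an 8-byte one.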
Upvotes: 1