Kevin Rave
Kevin Rave

Reputation: 14456

Bits, Bytes and numbers. Shrink the size of the byte

It may be a very basic low level architecture questions. I am trying to get my head around it. Please correct if my understanding is wrong, as well.

Word = 64 bit, 32 bit, etc. This is a number of bits computer can read at a time.

Questions:

1.) Would this mean, we can send, 4 numbers (of a 8 bits/byte length each) for 32 bit? Or combination of 8 bit (byte), 32 bit (4 bytes), etc numbers at one time?

2.) If we need to send only 8 bit number, then how does it form a word? Only first byte is filled and rest all bytes are padded with 0s or last byte gets filled while rest of the bytes are padded with 0s? Or I saw somewhere like first byte has information as to how the rest of the bytes are filled. Does that apply here? For example, UTF-8. Here, ASCII is 1 byte, and some other chars take up to 4 bytes. So when we send one char, we send all 4 bytes together, but fill the bytes as required for the char and rest of the bytes 0s?

3.) Now to represent 8 digit number, we would need 27 bits (remember famous question, sorting 1 million 8 digit number with just 1 MB RAM). Can we exactly use 27 bits, which is 32 bits (4 bytes) - 5 bits? and use those 5 digits for something else?

Appreciate your answers!

Upvotes: 1

Views: 2029

Answers (2)

ctype.h
ctype.h

Reputation: 1480

1- Yes, four 8-bit integers can fit in a 32-bit integer. This can be done using bitwise operations, for example (using C operators):

((a & 255) << 24) | ((b & 255) << 16) | ((c & 255) << 8) | (d & 255)

This example uses C operators, but they are also used for the same purpose in several other languages (see below - a complete, compilable version of this example in C). You may want to look up the bitwise operators AND (&), OR (|), and Left Shift (<<);

2- Unused bits are generally 0. The first byte is sometimes used to represent the type of encoding (Look up "Magic Numbers"), but this is implementation dependent. Sometimes it is a different number of bits.

3- Groups of 8-digit numbers can be compressed to use only 27 bits each. This is very similar to the example, except the number of bits and size of the data are different. To do this, you will need 864-bit groups, i.e. 27 32-bit integers to store 32 27-bit numbers. This would be more complex than the example, but it would use the same principles.


Complete, compilable example in C:

#include <stdio.h>

/*Compresses four integers containing one byte of data in the least
 *significant byte into a single 32-bit integer*/
__int32 compress(int a, int b, int c, int d){
   __int32 compressed = ((a & 255) << 24) | ((b & 255) << 16) |
      ((c & 255) << 8) | (d & 255);
   return compressed;
}

/*Test the compress() function and print the resuts*/
int main(){
   printf("%x\n", (unsigned)compress(255, 0, 255, 0));
   printf("%x\n", (unsigned)compress(192, 168, 0, 255));
   printf("%x\n", (unsigned)compress(84, 94, 255, 2));
   return 0;
}

Upvotes: 1

Aman
Aman

Reputation: 9035

I think that clarification on 2 points is required here :
1. Memory addressing.
2. Word

Memories can be addressed in 2 ways, they are generally either byte addressable or word addressable. Byte addressable memory means that each byte is given a separate address.
a -> 0th byte
b -> 1st byte
Word addressable memories are those in which each group of bytes that is as wide as the word gets an address. Eg if the Word Length is 32 bits :
a->0th byte
b->4th byte
And so on.

Word
I would say that a word defines the maximum number of bits a processor can handle at a time. For 8086, for eg, it's 16. It is usually the largest number on which the arithmetic can be performed by the processor. Continuing the example , 8086 can perform operations on 16 bit numbers at a time.

Now i'll try and answer the questions :

1.) Would this mean, we can send, 4 numbers (of a 8 bits/byte length each) for 32 bit? Or combination of 8 bit (byte), 32 bit (4 bytes), etc numbers at one time?

You can always define your own interpretation for a bunch of bits.

For eg, If it is byte addressable, we can treat every byte individually and thus , we can write code at assemble level that treats each byte as a separate 8 bit number. If it is not, you can use bit operations to extract individual bytes out.
The point is you can represent 4 8 bit numbers in 32 bits.

2) Mostly, leftover significant bits are stuffed with 0s ( for unsigned numbers)

3.) Now to represent 8 digit number, we would need 27 bits (remember famous question, sorting 1 million 8 digit number with just 1 MB RAM). Can we exactly use 27 bits, which is 32 bits (4 bytes) - 5 bits? and use those 5 digits for something else?

Yes, you can do this also. But you know the great space-time tradeoff. You sure save 5 bits, per number. But you'll need to use bit operations and all the really cool but hard to read stuff. Shooting up time and making code more complex.
But i don't think you'll ever come across a situation where you need such level of saving, unless you are coding for a very constrained system. (embedded etc)

Upvotes: 0

Related Questions