Xaisoft

Reputation: 46641

Getting a string, int, etc in binary representation?

Is it possible to get strings, ints, etc. in binary format? What I mean is, assume I have the string:

"Hello" and I want to store it in binary format, so assume "Hello" is

11110000110011001111111100000000 in binary (I know it's not; I just typed something quickly).

Can I store the above binary not as a string, but in the actual format with the bits?

In addition to this, is it actually possible to store fewer than 8 bits? What I am getting at is: if the letter A is the most frequent letter in a text, can I use 1 bit to store it for compression purposes, instead of building a binary tree?

Upvotes: 0

Views: 963

Answers (6)

Eric Lippert

Reputation: 660377

Is it possible to get strings, ints, etc in binary format?

Yes. There are several different methods for doing so. One common method is to make a MemoryStream out of an array of bytes, and then make a BinaryWriter on top of that memory stream, and then write ints, bools, chars, strings, whatever, to the BinaryWriter. That will fill the array with the bytes that represent the data you wrote. There are other ways to do this too.
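As a minimal sketch of the MemoryStream and BinaryWriter approach: the class name and the values written are made up for illustration, and the stream here is a growable one rather than one wrapped around a pre-allocated array.

```csharp
using System;
using System.IO;

class BinaryWriterSketch
{
    static void Main()
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(42);        // int: 4 bytes, little-endian
            writer.Write(true);      // bool: 1 byte
            writer.Write("Hello");   // string: length prefix, then UTF-8 bytes
        }

        // ToArray still works after the writer has closed the stream.
        byte[] bytes = stream.ToArray();
        Console.WriteLine(BitConverter.ToString(bytes));
        // 2A-00-00-00-01-05-48-65-6C-6C-6F
    }
}
```

Note that the string is written with a length prefix, so the bytes are not just the raw characters. Use an Encoding class directly if you want only the character bytes.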

Can I store the above binary not as a string, but in the actual format with the bits.

Sure, you can store an array of bytes.

is it actually possible to store less than 8 bits.

No. The smallest unit of storage in C# is a byte. However, there are classes that will let you treat an array of bytes as an array of bits. You should read about the BitArray class.
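A short sketch of what BitArray gives you, assuming two arbitrary example bytes. The bits are still stored in whole bytes underneath; the class only provides per-bit access.

```csharp
using System;
using System.Collections;

class BitArraySketch
{
    static void Main()
    {
        byte[] bytes = { 0xF0, 0xCC };
        // Index 0 is the least significant bit of bytes[0].
        var bits = new BitArray(bytes);

        Console.WriteLine(bits.Length);   // 16
        Console.WriteLine(bits[0]);       // False (lowest bit of 0xF0)
        Console.WriteLine(bits[7]);       // True  (highest bit of 0xF0)

        bits[0] = true;                   // set a single bit

        // Copy the bits back out into bytes.
        var result = new byte[2];
        bits.CopyTo(result, 0);
        Console.WriteLine(result[0].ToString("X2")); // F1
    }
}
```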

Upvotes: 3

James Jones

Reputation: 8773

What I am getting at is if the letter A is the most frequent letter used in a text, can I use 1 bit to store it with regards to compression instead of building a binary tree.

The algorithm you're describing is known as Huffman coding. To relate it to your example: if 'A' appears frequently in the data, the algorithm may represent 'A' with the single bit 1. If 'B' also appears frequently (but less frequently than 'A'), the algorithm would typically represent 'B' as 01. The remaining characters would then get longer codes beginning with 00.

In essence, the algorithm performs statistical analysis on the data and generates a code that will give you the most compression.
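Here is a rough sketch of how such a code can be built, not a tuned implementation. The names (HuffmanSketch, BuildCodes, Assign) and the sample text are invented for illustration, and the list is simply re-sorted on every pass instead of using a proper priority queue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class HuffmanSketch
{
    // A tree node: leaves carry a symbol, internal nodes carry two children.
    class Node
    {
        public char? Symbol;
        public int Weight;
        public Node Left, Right;
    }

    static Dictionary<char, string> BuildCodes(string text)
    {
        // One leaf per distinct character, weighted by how often it occurs.
        var nodes = text.GroupBy(c => c)
                        .Select(g => new Node { Symbol = g.Key, Weight = g.Count() })
                        .ToList();

        // Repeatedly merge the two lightest nodes until a single root remains.
        while (nodes.Count > 1)
        {
            nodes = nodes.OrderBy(n => n.Weight).ToList();
            var merged = new Node
            {
                Weight = nodes[0].Weight + nodes[1].Weight,
                Left = nodes[0],
                Right = nodes[1]
            };
            nodes.RemoveRange(0, 2);
            nodes.Add(merged);
        }

        var codes = new Dictionary<char, string>();
        if (nodes.Count == 1) Assign(nodes[0], "", codes);
        return codes;
    }

    // Walks the tree: going left appends 0, going right appends 1.
    static void Assign(Node node, string prefix, Dictionary<char, string> codes)
    {
        if (node.Symbol.HasValue)
        {
            // An input with only one distinct symbol still needs one bit per character.
            codes[node.Symbol.Value] = prefix.Length > 0 ? prefix : "0";
            return;
        }
        Assign(node.Left, prefix + "0", codes);
        Assign(node.Right, prefix + "1", codes);
    }

    static void Main()
    {
        // 'A' occurs 8 times, 'B' 3 times, 'C' twice, 'D' once.
        foreach (var pair in BuildCodes("AAAAAAAABBBCCD").OrderBy(p => p.Value.Length))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

For this sample text, 'A' ends up with a 1-bit code, 'B' with 2 bits, and 'C' and 'D' with 3 bits each. The codes are kept as strings of '0' and '1' only for readability; a real encoder would pack them into bytes.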

Upvotes: 2

Guffa

Reputation: 700592

What you are looking for is something like Huffman coding, which represents more common values with shorter bit patterns.

How you store the bit codes is still limited to whole bytes. There is no data type that uses less than a byte. The way to store variable-width bit values is to pack them end to end in a byte array. That gives you a stream of bit values, but it also means you can only read the stream from start to end. There is no random access to the values like you have with the byte values in a byte array.
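Packing end to end might look something like the sketch below. The Pack helper and the codes passed to it are hypothetical, with each code written as a string of '0' and '1' characters to keep the example readable.

```csharp
using System;
using System.Collections.Generic;

class BitPackingSketch
{
    // Packs the codes one after another, most significant bit first.
    // bitCount reports how many bits are real data; the rest of the last byte is padding.
    static byte[] Pack(IEnumerable<string> codes, out int bitCount)
    {
        var bytes = new List<byte>();
        bitCount = 0;
        foreach (string code in codes)
        {
            foreach (char bit in code)
            {
                if (bitCount % 8 == 0) bytes.Add(0);   // start a new byte
                if (bit == '1')
                    bytes[bytes.Count - 1] |= (byte)(0x80 >> (bitCount % 8));
                bitCount++;
            }
        }
        return bytes.ToArray();
    }

    static void Main()
    {
        // Four codes of 1, 2, 1 and 3 bits: 7 bits in total, stored in one byte.
        byte[] packed = Pack(new[] { "1", "00", "1", "011" }, out int bitCount);

        Console.WriteLine(bitCount);                                       // 7
        Console.WriteLine(Convert.ToString(packed[0], 2).PadLeft(8, '0')); // 10010110
    }
}
```

The bit count has to be stored alongside the bytes (or the code needs an end marker), otherwise a reader cannot tell the padding bits from data.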

Upvotes: 2

pavium

Reputation: 15118

The string is actually stored in binary format, as are all strings.

The difference between a string and another data type is that when your program displays the string, it retrieves the binary and shows the corresponding (ASCII) characters.

If you were to store data in a compressed format, you would need to assign more than 1 bit per character. How else would you identify which character is the most frequent?

If 1 represents an 'A', what does 0 mean? All the other characters?

Upvotes: 0

John Fisher

Reputation: 22717

You can use things like:

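BitConverter.GetBytes(1);              // System; the 4 bytes of the int
Encoding.ASCII.GetBytes("text");       // System.Text; 1 byte per character
Encoding.Unicode.GetBytes("text");     // System.Text; UTF-16, 2 bytes per character here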

Once you have the bytes, you can do all the bit twiddling you want. You would need an algorithm of some sort before we can give you much more useful information.
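For instance, a small sketch that prints the bits of each byte of "Hello", assuming ASCII is the encoding you want:

```csharp
using System;
using System.Linq;
using System.Text;

class ShowBitsSketch
{
    static void Main()
    {
        byte[] bytes = Encoding.ASCII.GetBytes("Hello");

        // Convert each byte to base 2 and left-pad to a full 8 bits.
        string bits = string.Join(" ",
            bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));

        Console.WriteLine(bits);
        // 01001000 01100101 01101100 01101100 01101111
    }
}
```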

Upvotes: 1

gn22

Reputation: 2086

What encoding would you be assuming?

Upvotes: 2
