user1692261

Reputation: 1217

byte array manipulation

I have byte arrays that are each about 30 bytes long.
I am looking for a way to pass this data to several users with the following requirements:

1. It has to be short: about 16 characters at most.
2. Only printable characters are acceptable (only digits or only letters would be even better).
3. It must be printable output (saving to a file or using a socket is not what I want).
4. (Thanks to Tudor) I want to be able to decode it back to the original array.

The arrays contain quite random data which makes this problem very difficult to crack.
I have tried many compression methods but with no luck so far.
After the compression I will probably encode the data in Base64 to make the output as short as possible (unless there is a better way to do it).

The project is basically in Java, but if there is a solution in another language I would love to hear about it.

thanks in advance

Upvotes: 3

Views: 1132

Answers (4)

Peter Lawrey

Reputation: 533492

Truly random data will take at least as many bytes after encoding as it did before (if not more).

When you compress data, you exploit inherent non-random structure in the data to make something which is more random but smaller. This is why it is very hard to compress already compressed data.

In your case you appear to want to encode 30 * 8 bits or 240 bits into 16 * 6 bits or 96 bits. This means your data must be far from random if it is to compress by a factor of at least 2.5. Compressing it this much every time would be very hard to do, and there is always the possibility that your compressed string will be larger than what you started with. All you can do is make this unlikely.
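To make the arithmetic above concrete, here is a minimal Java sketch using the standard `java.util.Base64` encoder. The class and method names (`Base64LengthDemo`, `base64Chars`, `encode`) are made up for illustration. It only shows how many Base64 characters 30 uncompressed bytes take; it does not attempt any compression.

```java
import java.security.SecureRandom;
import java.util.Base64;

class Base64LengthDemo {
    // Base64 characters needed for byteCount bytes without padding: ceil(bits / 6).
    static int base64Chars(int byteCount) {
        return (byteCount * 8 + 5) / 6;
    }

    // URL-safe Base64 without '=' padding, so only letters, digits, '-' and '_' appear.
    static String encode(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = new byte[30];
        new SecureRandom().nextBytes(data); // stand-in for the "quite random" arrays

        String text = encode(data);
        System.out.println(text + " (" + text.length() + " chars)");
        System.out.println("Bits in 30 bytes:           " + 30 * 8);
        System.out.println("Bits in 16 Base64 chars:    " + 16 * 6);
        System.out.println("Base64 chars for 30 bytes:  " + base64Chars(30));
    }
}
```

Without compression, 30 bytes need 40 Base64 characters, which is 2.5 times the 16-character budget.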

Unless your data has inherent compressibility, you cannot use lossless compression (which is reversible). If lossy compression is an option, you still have to make assumptions about what information can be lost.


If you need to match a code with some information, you can generate a random unique code and use it as a key into some database. The benefit of this approach is that the key can be as short as you like, provided it is long enough to give every record its own unique key, and you can associate as much information with the key as you like.
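A rough sketch of the key-lookup idea, assuming an in-memory `HashMap` as a stand-in for the database. The class name `ShortCodeStore`, its methods, and the alphabet are hypothetical choices, not something the answer specifies.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

class ShortCodeStore {
    // Assumed alphabet: upper-case letters and digits, minus easily confused ones.
    private static final String ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

    private final SecureRandom random = new SecureRandom();
    private final Map<String, byte[]> store = new HashMap<>(); // stand-in for a database table
    private final int codeLength;

    ShortCodeStore(int codeLength) {
        this.codeLength = codeLength;
    }

    // Stores a copy of the data under a fresh random code and returns that code.
    String put(byte[] data) {
        String code;
        do {
            code = randomCode();
        } while (store.containsKey(code)); // retry on the (unlikely) collision
        store.put(code, data.clone());
        return code;
    }

    // Returns a copy of the stored bytes, or null if the code is unknown.
    byte[] get(String code) {
        byte[] data = store.get(code);
        return data == null ? null : data.clone();
    }

    private String randomCode() {
        StringBuilder sb = new StringBuilder(codeLength);
        for (int i = 0; i < codeLength; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }
}
```

The code itself carries no information about the array, so every recipient needs access to the same store to look the data up.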

I believe this is your best option given the constraints you have.

Upvotes: 1

XistenZ

Reputation: 309

I believe one char is either 1 or 2 bytes, which means 16 chars = 16 to 32 bytes. One solution might be to define your own alphabet: if you can limit your chars to letters only, you need just 5 bits per char (26 letters fit in 2^5 = 32 values), so every 5 bytes could store 8 letters. Convert your chars to your own specification; when you decode, you just split the data every 5 bits.
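A small sketch of the 5-bits-per-letter packing described above, assuming a lower-case `a` to `z` alphabet. The class `FiveBitLetters` and its methods are illustrative names only. Note that this packs letters into bytes; going the other way, 16 such letters carry only 16 * 5 = 80 bits (10 bytes), so it does not by itself fit 30 arbitrary bytes.

```java
class FiveBitLetters {
    // Packs lower-case letters 'a'..'z' at 5 bits each, most significant bit first.
    static byte[] pack(String letters) {
        byte[] out = new byte[(letters.length() * 5 + 7) / 8];
        int bitPos = 0;
        for (char c : letters.toCharArray()) {
            int v = c - 'a';
            if (v < 0 || v > 25) {
                throw new IllegalArgumentException("not a lower-case letter: " + c);
            }
            for (int b = 4; b >= 0; b--, bitPos++) {
                if (((v >> b) & 1) != 0) {
                    out[bitPos / 8] |= (byte) (0x80 >> (bitPos % 8));
                }
            }
        }
        return out;
    }

    // Reads letterCount 5-bit groups back out of the packed bytes.
    static String unpack(byte[] packed, int letterCount) {
        StringBuilder sb = new StringBuilder(letterCount);
        int bitPos = 0;
        for (int i = 0; i < letterCount; i++) {
            int v = 0;
            for (int b = 0; b < 5; b++, bitPos++) {
                int bit = (packed[bitPos / 8] >> (7 - bitPos % 8)) & 1;
                v = (v << 1) | bit;
            }
            sb.append((char) ('a' + v));
        }
        return sb.toString();
    }
}
```

For example, `FiveBitLetters.pack("abcdefgh")` yields 5 bytes, and `unpack` with a letter count of 8 restores the original string.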

Upvotes: 0

Cristiano Zambon

Reputation: 466

Sorry, I don't fully understand your problem. You have 30 bytes of binary data, and you want to encode them in a printable string shorter than 16 bytes? If so, I would just say it is not possible... but maybe I just didn't understand the question.

If the 30 bytes can take all 256 possible values, there is no way to compress them down to 16 bytes in every possible case. That is not a Java issue; it is just mathematics. If, on the contrary, your bytes can only take a subset of values, then maybe there is something you can do, depending on how many bits the subset requires. To go from 30 bytes down to 16 for an arbitrary sequence, you can afford at most 4 bits per byte (16 * 8 / 30 is about 4.27), which means a subset of 16 values.
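To illustrate the restricted-subset case, here is a sketch that assumes every input byte is already in the range 0 to 15, so that two values fit in one output byte. `NibblePacker` is a hypothetical name. The result is 15 raw bytes, not printable text; turning those bytes into printable characters would cost extra length again.

```java
class NibblePacker {
    // Packs values in the range 0..15 two per byte (high nibble first).
    static byte[] pack(byte[] values) {
        byte[] out = new byte[(values.length + 1) / 2];
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            if (v < 0 || v > 15) {
                throw new IllegalArgumentException("value out of 4-bit range: " + v);
            }
            if (i % 2 == 0) {
                out[i / 2] |= (byte) (v << 4);
            } else {
                out[i / 2] |= (byte) v;
            }
        }
        return out;
    }

    // Restores count 4-bit values from the packed bytes.
    static byte[] unpack(byte[] packed, int count) {
        byte[] out = new byte[count];
        for (int i = 0; i < count; i++) {
            int b = packed[i / 2] & 0xFF;
            out[i] = (byte) (i % 2 == 0 ? b >> 4 : b & 0x0F);
        }
        return out;
    }
}
```

With 30 input values this produces 15 bytes, which is where the "4 bits per byte" limit comes from.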

Upvotes: 0

Zuu

Reputation: 1123

Your question (in case of later edits):

I have a ~30 length byte arrays. I am looking for a way to pass this data to several users with the following requirements:

  1. It has to be short.. about 16 chars at max.
  2. Only printable chars are acceptable (only digits or only letters will be even better).
  3. It must to be a printable output (save to file or using socket is not what I want).

The arrays contain quite random data which makes this problem very difficult to crack.

Answer: Given that you have a 30-byte array with 'random' data in it, it is not possible to compress that into just 16 characters consisting only of digits and Latin letters.

There is simply too much information in 30 bytes compared to 16 latin characters.
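The counting argument can be checked directly with `java.math.BigInteger`. This sketch assumes an alphabet of 62 symbols (digits plus upper- and lower-case Latin letters); `CountingArgument`, `count`, and `minLength` are illustrative names.

```java
import java.math.BigInteger;

class CountingArgument {
    // Number of distinct strings of the given length over an alphabet of the given size.
    static BigInteger count(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }

    // Smallest string length whose string count covers every possible byte array
    // of byteCount bytes.
    static int minLength(int alphabetSize, int byteCount) {
        BigInteger needed = count(256, byteCount);
        BigInteger available = BigInteger.ONE;
        int length = 0;
        while (available.compareTo(needed) < 0) {
            available = available.multiply(BigInteger.valueOf(alphabetSize));
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        BigInteger inputs = count(256, 30); // all possible 30-byte arrays
        BigInteger outputs = count(62, 16); // all 16-char alphanumeric strings
        System.out.println("Outputs cover all inputs: " + (outputs.compareTo(inputs) >= 0));
        System.out.println("Alphanumeric chars needed for 30 bytes: " + minLength(62, 30));
    }
}
```

Since there are far fewer 16-character alphanumeric strings than 30-byte arrays, no reversible mapping can cover every array.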

What you could do, however, is use a much larger alphabet, say all of the printable characters of Unicode. I don't know exactly how many such characters there are, and some code points (and sequences of code points) have the same visual presentation. All you need is 256 * 256 = 65536 visually distinct characters in total. That way you can encode two bytes into one character and store up to 32 bytes in a 16-character string.
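A very rough sketch of the two-bytes-per-character idea. It assumes, purely for illustration, that the alphabet is the 65536 consecutive code points starting at U+20000; many of those are unassigned and will not render, so a real implementation would need a curated table of 65536 visibly distinct characters instead. The class name and `BASE` constant are hypothetical.

```java
class TwoBytesPerCodePoint {
    // Assumption: hypothetical alphabet of 65536 consecutive code points from U+20000.
    // Not all of them are assigned or printable.
    static final int BASE = 0x20000;

    // Encodes each pair of bytes as one code point (a trailing odd byte is padded with 0).
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            int lo = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;
            sb.appendCodePoint(BASE + (hi << 8 | lo));
        }
        return sb.toString();
    }

    // Decodes byteCount bytes; the count is passed in because padding is ambiguous.
    static byte[] decode(String text, int byteCount) {
        int[] codePoints = text.codePoints().toArray();
        byte[] out = new byte[byteCount];
        for (int i = 0; i < byteCount; i++) {
            int v = codePoints[i / 2] - BASE;
            out[i] = (byte) (i % 2 == 0 ? v >> 8 : v);
        }
        return out;
    }
}
```

Thirty bytes become 15 code points here. In Java these are supplementary characters, so `String.length()` reports 30 UTF-16 units while `codePointCount` reports 15.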

Notice, however, that there is no compression involved in this technique; it is merely a different encoding of the same raw data. Random data is not compressible.

Upvotes: 1
