Reputation: 339
TLDR: What is the edge case that I'm missing, or is there a mistake in my algorithm for converting a Base64 string to a Hex string?
I recently decided to try the Matasano Crypto Challenges, however for whatever reason, I decided to try and write the first challenge without using a library for converting between Hex and Base64 strings.
I've managed to get the Hex to Base64 conversion working, but as you can see from the output, there is a slight anomaly when I try to convert Base64 back into Hex (eg. compare the last four values of the Base64 to Hex output).
Hex To Base64:
Should Print: SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t Actually Prints: SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29tBase64 to Hex:
Should Print: 49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
Actually Prints: 49276d206b696c6c696e6720796e717220627261696e206c696b65206120706e69732e6e6f3573206c717328726f2e6d
I was using https://conv.darkbyte.ru/ to check some of my values, and assuming the code on that site is correct, it seems my problem is to do with getting the Base10 representation from Base64, and not Base10 to Hex:
Decimal Equivalent
My Output:
73, 39, 109, 32, 107, 105, 108, 108, 105, 110, 103, 32, 121, 110, 113, 114, 32, 98, 114, 97, 105, 110, 32, 108, 105, 107, 101, 32, 97, 32, 112, 110, 105, 115, 46, 110, 111, 53, 115, 32, 108, 113, 115, 40, 114, 111, 46, 109Site’s Output:
73, 39, 109, 32, 107, 105, 108, 108, 105, 110, 103, 32, 121, 111, 117, 114, 32, 98, 114, 97, 105, 110, 32, 108, 105, 107, 101, 32, 97, 32, 112, 111, 105, 115, 111, 110, 111, 117, 115, 32, 109, 117, 115, 104, 114, 111, 111, 109
It seems that all of the values with errors are clustered around 40-60 and 100-120, but I'm not sure where exactly to go from there. I'm guessing there's some sort of edge case I'm note handling, but I'm not sure what that would be.
Relevant code:
private static final Character[] base64Order = new Character[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e',
'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/', };
private static final Character[] hexOrder = new Character[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a',
'b', 'c', 'd', 'e', 'f' };
public static String base64ToHex(String base64) throws Exception {
if (base64.length() % 4 != 0 || base64.contains("[^a-zA-Z0-9\\+/]"))
throw new Exception("InputNotBase64");
else {
int charValue = 0;
int index = 0;
String hex = "";
BitSet bits = new BitSet();
for (int i = 0; i < base64.length(); i++) {
charValue = base64.charAt(i);
// get actual value from ASCII table
if (charValue > 64 && charValue < 91)
charValue -= 65;
if (charValue > 96 && charValue < 123)
charValue -= 71;
/// loop that adds to the BitSet reads right-to-left, so reverse
// the bits and then shift
charValue = Integer.reverse(charValue << 24) & 0xff;
charValue >>= 2;
// append binary values to the BitSet
while (charValue != 0L) {
if (charValue % 2 != 0) {
bits.set(index);
}
index++;
charValue >>= 1;
}
// account for trailing 0s
while (index % 6 != 0) {
index++;
}
}
// read 8-bit integer value for hex-value lookup
String temp;
int remainder;
for (int i = 0; i < index; i++) {
charValue = (charValue | (bits.get(i) ? 1 : 0));
if ((i + 1) % 8 == 0) {
temp = "";
while (charValue != 0L) {
remainder = charValue % 16;
temp = hexOrder[remainder] + temp;
charValue /= 16;
}
hex += temp;
}
charValue <<= 1;
}
return hex;
}
}
Upvotes: 4
Views: 861
Reputation: 392
You forgot handle following chars in your code: '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/' If you replace following code
if (charValue > 64 && charValue < 91)
charValue -= 65;
if (charValue > 96 && charValue < 123)
charValue -= 71;
by
charValue = getPositionInBase64(charValue);
where
public static int getPositionInBase64(int n)
{
for (int p = 0; p < base64Order.length; p++)
{
if (n == base64Order[p])
{
return p;
}
}
return -1;
}
all works OK
Additionally code is more readable when you use chars instead of magic numbers
if (charValue >= 'A' && charValue <= 'Z')
charValue -= 'A';
...
In this case finding problems is easier
Because you asked I'm presenting possible improvments to speed up calculations.
Prepare following table and init it once
// index = character, value = index of character from base64Order
private static final int[] base64ToInt = new int[128];
public static void initBase64ToIntTable()
{
for (int i = 0; i < base64Order.length; i++)
{
base64ToInt[base64Order[i]] = i;
}
}
Now you can replace your if/else chain by simply operation
charValue = base64ToInt[base64.charAt(i)];
Using this I wrote method a few times faster then your
private static String intToHex(int n)
{
return String.valueOf(new char[] { hexOrder[n/16], hexOrder[n%16] });
}
public static String base64ToHexVer2(String base64) throws Exception
{
StringBuilder hex = new StringBuilder(base64.length()*3/4); //capacity could be 3/4 of base64 string length
if (base64.length() % 4 != 0 || base64.contains("[^a-zA-Z0-9\\+/]"))
{
throw new Exception("InputNotBase64");
}
else
{
for (int i = 0; i < base64.length(); i += 4)
{
int n0 = base64ToInt[base64.charAt(i)];
int n1 = base64ToInt[base64.charAt(i+1)];
int n2 = base64ToInt[base64.charAt(i+2)];
int n3 = base64ToInt[base64.charAt(i+3)];
// in descriptions I treat all 64 base chars as 6 bit
// all 6 bites from 0 and 1st 2 from 1st (00000011 ........ ........)
hex.append(intToHex(n0*4 + n1/16));
// last 4 bites from 1st and first 4 from 2nd (........ 11112222 ........)
hex.append(intToHex((n1%16)*16 + n2/4));
// last 2 bites from 2nd and all from 3rd (........ ........ 22333333)
hex.append(intToHex((n2%4)*64 + n3));
}
}
return hex.toString();
}
I susspect that this code is faster mainly because simplier translation to hex. If you want to and need it you can test it.
To test speed you can use following construction
String b64 = "SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t";
try
{
Base64ToHex.initBase64ToIntTable();
System.out.println(Base64ToHex.base64ToHex(b64));
System.out.println(Base64ToHex.base64ToHexVer2(b64));
int howManyIterations = 100000;
Date start, stop;
long period;
start = new Date();
for (int i = 0; i < howManyIterations; i++)
{
Base64ToHex.base64ToHexVer2(b64);
}
stop = new Date();
period = stop.getTime() - start.getTime();
System.out.println("Ver2 taken " + period + " ms");
start = new Date();
for (int i = 0; i < howManyIterations; i++)
{
Base64ToHex.base64ToHex(b64);
}
stop = new Date();
period = stop.getTime() - start.getTime();
System.out.println("Ver1 taken " + period + " ms");
}
catch (Exception ex)
{
}
Example result is
49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
Ver2 taken 300 ms
Ver1 taken 2080 ms
But it is only approximation. Results can be a little bit different when you check Ver1 first and Ver2 as second. Additionally results can be different for diferent javas (6th, 7th, 8th) and for different settings used to start java
Upvotes: 1