Ferruccio Balestreri

Reputation: 25

Why do I get a wrong XOR output

I just started the cryptopals.com challenges and I'm already stuck on the second problem. My output is wrong by only one character: the first digit of my XOR result is 3 instead of 7.

Could you help me find the mistake in my code:

def XORfunction(input_1, input_2):
    bin_input_1 = hexToBinary(input_1)
    bin_input_2 = hexToBinary(input_2)

    # check if length of strings is the same for XOR compare or add "0" to the end
    if len(bin_input_1) != len(bin_input_2):

        if len(bin_input_1) > len(bin_input_2):
            number_length = len(bin_input_1)
            temp_input = list(bin_input_2)
            for x in xrange(0, number_length - len(bin_input_2)):
                temp_input.insert(0, "0")
            bin_input_2 = "".join(temp_input)

        if len(bin_input_1) < len(bin_input_2):
            number_length = len(bin_input_2)
            temp_input = list(bin_input_1)
            for x in xrange(0, number_length - len(bin_input_1)):
               temp_input.insert(0, "0")
            bin_input_1 = "".join(temp_input)
    solution = []
    # XOR is like a sum so if el1+el2 == 1 output is 1 else output is 0
    for x in xrange(0, len(bin_input_1) - 1):
        # the array is iterated from [0] to len(bin_input_1)-1 so the elements are calculated from last to first
        current_compare = int(bin_input_1[x]) + int(bin_input_2[x])
        if current_compare == 1:
            solution.insert(-1, "1")
        else:
            solution.insert(-1, "0")
    return dec_to_hex(int("".join(solution), 2))


# the final solution has to be converted from decimal to hexadecimal
def dec_to_hex(value):
    dictionary_hex = "0123456789abcdef"
    solution = []
    while value != 0:
        solution.insert(0, dictionary_hex[value % 16])
        value = value / 16
    return "".join(solution)


# Hex is converted to a binary string to make comparisons easy as the digits become easy to select as an array of chars
def hexToBinary(text):
    # first Hex is converted to decimal, then to binary (that needs to be sliced for a clean output), lastly it becomes a string
    return str(bin(int(text, base=16))[2:])


print XORfunction("1c0111001f010100061a024b53535009181c", "686974207468652062756c6c277320657965")

# expected output: 746865206b696420646f6e277420706c6179
# my output:       346865206b696420646f6e277420706c6179

This is my first time posting, so any tips on formatting or on the code are welcome.

PS: I know I should be using libraries, but I want to figure out what my mistake is first.

Upvotes: 1

Views: 1056

Answers (1)

Martijn Pieters

Reputation: 1122102

You have several issues:

  • Your hexToBinary() function doesn't produce padded binary. bin() will not return 8 bits per byte; leading zeros are not included! As such, you are missing 000 from the start of the first string and 0 from the start of the other. You try to compensate for this in your XORfunction function, but that only pads the shorter string to the length of the longer one, which adds back 2 zeros, not 3, and leaves both strings one bit short.

    You could use the str.format() method instead to ensure that you get the right number of bits, zero padded:

    return '{:0{}b}'.format(int(text, base=16), len(text) * 4)
    

    The b formatting instruction tells str.format() to produce the binary representation of a number. 0 before the width means to zero-pad the number to the required length, and the {} placeholder for the length is taken from the len(text) * 4 value, so 4 bits per hex character in the input.

  • You are inserting the solution bits before the last element in the list. This leaves the very first bit right at the end of your solution, with everything else inserted before it:

    >>> demo = []
    >>> demo.insert(-1, 'foo')  # inserting into an empty list
    >>> demo
    ['foo']
    >>> demo.insert(-1, 'bar')  # inserting before the last element
    >>> demo
    ['bar', 'foo']
    >>> demo.insert(-1, 'spam') # inserting before the last element
    >>> demo
    ['bar', 'spam', 'foo']
    

    Just use appending to add elements to the end of a list:

    solution.append("1")
    

    and

    solution.append("0")
    
  • You skip processing the last bit. You need to iterate all the way to len(bin_input_1):

    for x in xrange(len(bin_input_1)):
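
To make the padding problem from the first point concrete, here is a minimal sketch comparing the two conversions on a single byte. The function names hex_to_binary_unpadded and hex_to_binary_padded are made up for this illustration; the bodies are the original bin() expression and the str.format() replacement suggested above.

```python
def hex_to_binary_unpadded(text):
    # Same expression as the original hexToBinary(): bin() drops leading zero bits.
    return bin(int(text, base=16))[2:]


def hex_to_binary_padded(text):
    # 4 bits per hex digit, zero-padded on the left to the full width.
    return '{:0{}b}'.format(int(text, base=16), len(text) * 4)


print(hex_to_binary_unpadded("1c"))  # 11100 (three leading zeros lost)
print(hex_to_binary_padded("1c"))    # 00011100
```

With the padded version, both of your 36-character inputs convert to exactly 144 bits, so the length-matching code in XORfunction never has to run for them.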
    

With those 3 fixes applied, your code works and produces the expected output.
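
For reference, this is roughly what the code looks like with the three fixes in place. It is a sketch rather than a drop-in replacement: I have written it so that it also runs on Python 3 (range() instead of xrange(), // instead of /, print() as a function), and I have replaced the manual length-matching loops with str.zfill(), which is my own simplification.

```python
def hexToBinary(text):
    # Fix 1: zero-pad to 4 bits per hex digit so leading zeros are kept.
    return '{:0{}b}'.format(int(text, base=16), len(text) * 4)


def dec_to_hex(value):
    dictionary_hex = "0123456789abcdef"
    solution = []
    while value != 0:
        solution.insert(0, dictionary_hex[value % 16])
        value = value // 16  # integer division on both Python 2 and 3
    return "".join(solution)


def XORfunction(input_1, input_2):
    bin_input_1 = hexToBinary(input_1)
    bin_input_2 = hexToBinary(input_2)

    # Left-pad the shorter string with zeros (zfill is a simplification
    # of the original insert(0, "0") loops).
    width = max(len(bin_input_1), len(bin_input_2))
    bin_input_1 = bin_input_1.zfill(width)
    bin_input_2 = bin_input_2.zfill(width)

    solution = []
    # Fix 3: iterate over every bit, including the last one.
    for x in range(len(bin_input_1)):
        current_compare = int(bin_input_1[x]) + int(bin_input_2[x])
        # Fix 2: append to the end instead of insert(-1, ...).
        solution.append("1" if current_compare == 1 else "0")
    return dec_to_hex(int("".join(solution), 2))


print(XORfunction("1c0111001f010100061a024b53535009181c",
                  "686974207468652062756c6c277320657965"))
# 746865206b696420646f6e277420706c6179
```

Note that dec_to_hex() still drops leading zero digits from the result (and returns an empty string for 0), so it only matches the expected output when the first hex digit of the result is non-zero, as it is here.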

Your code is indeed re-inventing standard wheels in the Python language and standard library:

  • Rather than manually XOR every bit, use the ^ operator to work on a whole byte at a time.
  • Use the binascii.hexlify() and binascii.unhexlify() functions to convert between hexadecimal and bytes.
  • In Python 2, use the bytearray() type to work with binary data as a sequence of integers; this makes it much easier to apply XOR operations.
  • Use the zip() function to iterate over two sequences together, pairing up elements from both.

Put together as a Python 2 solution:

from binascii import hexlify, unhexlify

def XORfunction(input_1, input_2):
    input_1 = bytearray(unhexlify(input_1))
    input_2 = bytearray(unhexlify(input_2))
    return hexlify(bytearray(
        a ^ b for a, b in zip(input_1, input_2)))

In Python 3, you can simply omit the first two bytearray() calls, and replace the last with bytes().
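
Spelled out, the Python 3 variant might look like the sketch below. It relies on two Python 3 behaviours: iterating over a bytes object yields integers, and hexlify() returns bytes rather than str. The .decode('ascii') in the usage line is my addition for printing a plain string.

```python
from binascii import hexlify, unhexlify


def XORfunction(input_1, input_2):
    # unhexlify() returns bytes; iterating over bytes yields ints in Python 3.
    input_1 = unhexlify(input_1)
    input_2 = unhexlify(input_2)
    # zip() stops at the shorter input, so unequal lengths are truncated.
    return hexlify(bytes(a ^ b for a, b in zip(input_1, input_2)))


result = XORfunction("1c0111001f010100061a024b53535009181c",
                     "686974207468652062756c6c277320657965")
print(result.decode('ascii'))  # 746865206b696420646f6e277420706c6179
```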

Upvotes: 2
