Just A Mathematician
Just A Mathematician

Reputation: 240

Struggling to convert a number to base 64 in Python without strings

So I am trying to program (in Python 3 without strings) this cool project that I found.

Return the 6-character string representation of the 36-bit number n as a base-64 number in reverse order where the order of the 64 numerals is:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+

For example,

encode(0) → '000000'
encode(9876543210) → 'gR1iC9'
encode(68719476735) → '++++++'

What I have so far is:

def encode(n):
  SYM = {'0': 0,
         '1': 1,
         '2': 2,
         '3': 3,
         '4': 4,
         '5': 5,
         '6': 6,
         '7': 7,
         '8': 8,
         '9': 9,
         'A': 10,
         'B': 11,
         'C': 12,
         'D': 13,
         'E': 14,
         'F': 15,
         'G': 16,
         'H': 17,
         'I': 18,
         'J': 19,
         'K': 20,
         'L': 21,
         'M': 22,
         'N': 23,
         'O': 24,
         'P': 25,
         'Q': 26,
         'R': 27,
         'S': 28,
         'T': 29,
         'U': 30,
         'V': 31,
         'W': 32,
         'X': 33,
         'Y': 34,
         'Z': 35,
         'a': 36,
         'b': 37,
         'c': 38,
         'd': 39,
         'e': 40,
         'f': 41,
         'g': 42,
         'h': 43,
         'i': 44,
         'j': 45,
         'k': 46,
         'l': 47,
         'm': 48,
         'n': 49,
         'o': 50,
         'p': 51,
         'q': 52,
         'r': 53,
         's': 54,
         't': 55,
         'u': 56,
         'v': 57,
         'w': 58,
         'x': 59,
         'y': 60,
         'z': 61,
         '-': 62,
         '+': 63,}

But now I am not sure what to do next. I do not want to use strings and concatenation etc, I would like to do this using modulus and the standard number theory + for/while/else methods.

My thoughts were to define

r1 = n % 63
r2 = r1 % 63
r3 = r2 % 63
r4 = r3 % 63
r5 = r4 % 63
r6 = r5 % 63

But I'm not sure what to do from there.

For example, how should I convert n to base 64? At the end, to reverse the digits after I have found the new representation, I am thinking that I will just mod each power of 10 to isolate each individual digit and then concatenate them in reverse order. However, I'm not sure how to program this in Python, since I'm relatively new to the language. All help is appreciated, thank you.

Upvotes: 2

Views: 1657

Answers (1)

PM 2Ring
PM 2Ring

Reputation: 55469

Here's some code that does what you want. The get_digit function uses a bunch of if... elif tests to convert an integer d in 0 <= d < 64 into its corresponding character number, and then uses the standard chr function to convert that number to the actual character. The encode function does the actual remainder calculations, calling get_digit to do the character conversion, and saving the results into the out list. We append that list with '0' characters to make its length 6.

def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d)) 
    return chr(c)

# Test `digit`
print(''.join([get_digit(d) for d in range(64)]))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

# Test `encode`
for i in (0, 9876543210, 68719476735):
    print(i, encode(i))

output

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
0 000000
9876543210 gR1iC9
68719476735 ++++++

Since we're working with a base that's a power of 2, an alternative to

n, r = n // 64, n % 64

is to use bitwise operations

n, r = n >> 64, n & 63

which are slightly faster, but I guess that doesn't make much difference here, and the previous code is more readable. OTOH, it can be useful to understand why the bitwise version produces the correct result.

Upvotes: 2

Related Questions