Jakub

Reputation: 13860

Base64 and utf8 / National characters encoding

I want to Base64-encode a string that contains Polish national characters. For example:

"zażółć gęślą jaźń"

Should be:

emEmIzM4MDvzJiMzMjI7JiMyNjM7IGcmIzI4MTsmIzM0NztsJiMyNjE7IGphJiMzNzg7JiMzMjQ7

But after implementing this solution:

-(NSString *)Base64Encode:(NSData *)data{
    //Point to start of the data and set buffer sizes
    int inLength = [data length];
    int outLength = ((((inLength * 4)/3)/4)*4) + (((inLength * 4)/3)%4 ? 4 : 0);
    const char *inputBuffer = [data bytes];
    char *outputBuffer = malloc(outLength + 1); // +1 for the terminating NUL written on the next line
    outputBuffer[outLength] = 0;

    //64 digit code
    static char Encode[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    //start the count
    int cycle = 0;
    int inpos = 0;
    int outpos = 0;
    char temp;

    //Pad the last two bytes, the outbuffer must always be a multiple of 4
    outputBuffer[outLength-1] = '=';
    outputBuffer[outLength-2] = '=';


    while (inpos < inLength){
        switch (cycle) {
            case 0:
                outputBuffer[outpos++] = Encode[(inputBuffer[inpos]&0xFC)>>2];
                cycle = 1;
                break;
            case 1:
                temp = (inputBuffer[inpos++]&0x03)<<4;
                outputBuffer[outpos] = Encode[temp];
                cycle = 2;
                break;
            case 2:
                outputBuffer[outpos++] = Encode[temp|(inputBuffer[inpos]&0xF0)>> 4];
                temp = (inputBuffer[inpos++]&0x0F)<<2;
                outputBuffer[outpos] = Encode[temp];
                cycle = 3;                  
                break;
            case 3:
                outputBuffer[outpos++] = Encode[temp|(inputBuffer[inpos]&0xC0)>>6];
                cycle = 4;
                break;
            case 4:
                outputBuffer[outpos++] = Encode[inputBuffer[inpos++]&0x3f];
                cycle = 0;
                break;                          
            default:
                cycle = 0;
                break;
        }
    }
    NSString *pictemp = [NSString stringWithUTF8String:outputBuffer];
    free(outputBuffer); 
    return pictemp;
}

of course I get something slightly different:

emHFvMOzxYLEhyBnxJnFm2zEhSBqYcW6xYQ

Which an online decoder returns to me as:

zażółć gęślą jaźń

I'm calling it that way:

NSString* str = @"zażółć gęślą jaźń";
NSData* data=[str dataUsingEncoding:NSUTF8StringEncoding];

NSString *encodeString = [self Base64Encode:data];

Upvotes: 0

Views: 1283

Answers (1)

Wevah

Reputation: 28242

The online decoder's page is served as ISO-8859-1, not UTF-8, so the decoded UTF-8 bytes are displayed with the wrong character set. If you force the page encoding to UTF-8, it works.
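
To illustrate the mismatch: ż is the two bytes 0xC5 0xBC in UTF-8, and a page that reads those bytes as ISO-8859-1 shows two separate characters, Å and ¼. Below is a minimal C sketch of that misreading (plain C, so it should also compile in an Objective-C file). `latin1_to_utf8` is a hypothetical helper name used only for illustration, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Treat every input byte as one ISO-8859-1 code point and re-encode it
   as UTF-8. This mimics what a Latin-1 page does to UTF-8 data.
   `out` must have room for 2 * n bytes. Returns the bytes written. */
size_t latin1_to_utf8(const uint8_t *in, size_t n, uint8_t *out) {
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[o++] = in[i];                            /* ASCII is unchanged */
        } else {
            out[o++] = (uint8_t)(0xC0 | (in[i] >> 6));   /* lead byte */
            out[o++] = (uint8_t)(0x80 | (in[i] & 0x3F)); /* continuation byte */
        }
    }
    return o;
}
```

Feeding it the two UTF-8 bytes of ż (0xC5 0xBC) yields four bytes, 0xC3 0x85 0xC2 0xBC, which is the UTF-8 encoding of "Å¼". The Base64 data itself is unaffected; only the interpretation of the decoded bytes changes.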

Also, the difference in the encoded version might be because of composed vs. decomposed characters (not sure).
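
One way to check that guess is to decode the expected string itself. Its first twelve characters, `emEmIzM4MDvz`, decode to the bytes `za&#380;` followed by 0xF3. That looks like HTML numeric character references (`&#380;` is ż) mixed with a single-byte encoding of ó, which would point to the reference string having been built from HTML-escaped text rather than to a normalization difference. Here is a minimal C sketch of that check (plain C, so it should also compile in an Objective-C file). `b64_value` and `base64_decode_groups` are hypothetical helper names, and the sketch assumes complete four-character groups with no `=` padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map one Base64 character to its 6-bit value, or -1 if it is not
   in the standard alphabet. */
static int b64_value(char c) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (c != '\0') ? strchr(table, c) : NULL;
    return p ? (int)(p - table) : -1;
}

/* Decode the complete 4-character groups in in[0..n). Padding is not
   handled; decoding stops at the first character outside the alphabet.
   `out` must have room for 3 * (n / 4) bytes. Returns the bytes written. */
size_t base64_decode_groups(const char *in, size_t n, uint8_t *out) {
    size_t o = 0;
    for (size_t i = 0; i + 4 <= n; i += 4) {
        int a = b64_value(in[i]);
        int b = b64_value(in[i + 1]);
        int c = b64_value(in[i + 2]);
        int d = b64_value(in[i + 3]);
        if (a < 0 || b < 0 || c < 0 || d < 0) {
            break;
        }
        out[o++] = (uint8_t)((a << 2) | (b >> 4));
        out[o++] = (uint8_t)(((b & 0x0F) << 4) | (c >> 2));
        out[o++] = (uint8_t)(((c & 0x03) << 6) | d);
    }
    return o;
}
```

Calling `base64_decode_groups("emEmIzM4MDvz", 12, buf)` writes nine bytes: the ASCII text `za&#380;` and then 0xF3.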

Upvotes: 1
