김철영

Reputation: 1

How to remove garbage characters from an image_to_base64 string

I encode cropped images as base64 strings.

Most of them are encoded correctly, but I've noticed that some strings have garbage appended at the end, for example:
ex1) ~~~ PM3fhtGKOYZ/9k=
ex2) ~~~ f8KKKAP//Z

I also confirmed that once the garbage is removed, the result is a valid base64 string.

I suspect it is related to the length of the allocated string, but I don't know exactly what the problem is or how to solve it, so I am asking for help.

This is my code:

static char encoding_table[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
                            'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
                            'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                            'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
                            'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
                            'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
                            'w', 'x', 'y', 'z', '0', '1', '2', '3',
                            '4', '5', '6', '7', '8', '9', '+', '/'};
static char *decoding_table = NULL;
static int mod_table[] = {0, 2, 1};

void build_decoding_table() {

  decoding_table = (char*)malloc(256);

  for (int i = 0; i < 64; i++)
    decoding_table[(unsigned char) encoding_table[i]] = i;
}

char *base64_encode(const unsigned char *data, uint32_t input_length, size_t *output_length) {

  *output_length = 4 * ((input_length + 2) / 3);

  char *encoded_data = (char*)malloc(*output_length);
  if (encoded_data == NULL) return NULL;

  for (int i = 0, j = 0; i < input_length;) {

    uint32_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
    uint32_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
    uint32_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;
 
    uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

    encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];

  }

  for (int i = 0; i < mod_table[input_length % 3]; i++)
    encoded_data[*output_length - 1 - i] = '=';

  return encoded_data;
}

static char *rt_b64(){
  char *lobi;
  size_t output_length;
  // "enc_jpeg_image" is a structure that holds information about objects in the pipeline
  lobi = base64_encode(enc_jpeg_image->outBuffer, enc_jpeg_image->outLen, &output_length);
  return lobi;
}

Upvotes: 0

Views: 186

Answers (1)

the busybee

Reputation: 12600

You forgot to add the end-of-string marker '\0'.

Therefore your "string" has no defined end, and every string function that processes it keeps reading until it happens to find a '\0'. The bytes after the allocated space are used for other things, for example the memory management of malloc(), so these functions access memory out of bounds, which is undefined behavior. Interpreted as characters, those extra bytes look like the "garbage" you describe.
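To illustrate the point: a buffer without a terminator can only be handled safely together with its length. The sketch below is not part of the original code; `copy_as_c_string` is a hypothetical helper name, and it assumes you still have the length that `base64_encode` reports in `output_length`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies `len` bytes from a buffer that has no
   terminator into a freshly allocated, '\0'-terminated string.
   Returns NULL if the allocation fails. The caller must free() the result. */
char *copy_as_c_string(const char *buf, size_t len) {
    assert(buf != NULL);

    char *s = (char *)malloc(len + 1);  /* one extra byte for '\0' */
    if (s == NULL) {
        return NULL;
    }

    memcpy(s, buf, len);  /* copy exactly len bytes, never more */
    s[len] = '\0';        /* now strlen(), printf("%s") etc. are safe */
    return s;
}
```

If you only need to print the buffer, `printf("%.*s", (int)len, buf)` also works without a terminator, because the precision limits how many characters are read.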


Solve this issue by allocating one more character and marking the end of the string:

char *base64_encode(const unsigned char *data, size_t input_length, size_t *output_length) {
    *output_length = 4 * ((input_length + 2) / 3);

    char *encoded_data = (char*)malloc(*output_length + 1); /* <-- here */
    if (encoded_data == NULL) {
        return NULL;
    }

    for (size_t i = 0, j = 0; i < input_length; ) {
        size_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
        size_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
        size_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;
 
        size_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];
    }

    for (size_t i = 0; i < (size_t)mod_table[input_length % 3]; i++) {
        encoded_data[*output_length - 1 - i] = '=';
    }

    encoded_data[*output_length] = '\0'; /* <-- here */

    return encoded_data;
}

Note: I have adjusted some types to produce fewer warnings.
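If you want to check that a terminated encoder no longer produces trailing garbage, you could compare its output against the well-known test vectors from RFC 4648 ("f" gives "Zg==", "fo" gives "Zm8=", "foo" gives "Zm9v"). The following is only a sketch: `b64_encode_terminated` and `b64_matches` are hypothetical names, and the encoder is a compact variant written for this check, not the code from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static const char b64_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical compact encoder: returns a malloc'd, '\0'-terminated
   base64 string, or NULL if the allocation fails. Caller must free(). */
char *b64_encode_terminated(const unsigned char *data, size_t len) {
    assert(data != NULL || len == 0);

    size_t out_len = 4 * ((len + 2) / 3);
    char *out = (char *)malloc(out_len + 1);  /* +1 for the terminator */
    if (out == NULL) {
        return NULL;
    }

    size_t i = 0, j = 0;
    while (i < len) {
        size_t remaining = len - i;  /* input bytes not yet consumed */
        unsigned long triple = (unsigned long)data[i] << 16;
        if (remaining > 1) triple |= (unsigned long)data[i + 1] << 8;
        if (remaining > 2) triple |= (unsigned long)data[i + 2];

        out[j++] = b64_table[(triple >> 18) & 0x3F];
        out[j++] = b64_table[(triple >> 12) & 0x3F];
        /* a short final group is padded with '=' */
        out[j++] = remaining > 1 ? b64_table[(triple >> 6) & 0x3F] : '=';
        out[j++] = remaining > 2 ? b64_table[triple & 0x3F] : '=';

        i += remaining < 3 ? remaining : 3;
    }

    out[j] = '\0';  /* the end-of-string marker */
    return out;
}

/* Returns 1 if encoding the C string `input` yields exactly `expected`. */
int b64_matches(const char *input, const char *expected) {
    char *enc = b64_encode_terminated((const unsigned char *)input,
                                      strlen(input));
    if (enc == NULL) {
        return 0;
    }
    int ok = strcmp(enc, expected) == 0;  /* safe: enc is terminated */
    free(enc);
    return ok;
}
```

Because `strcmp()` compares up to the terminator, any stray bytes after the encoded data would make such a comparison fail, so it doubles as a check for the original problem.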

Upvotes: 1
