Reputation: 395
I am working on a C application that needs to display UTF-8 encoded Unicode characters. The input arrives as a character array of binary digits, for example 11010000 10100100, which is the UTF-8 encoding of the character "Ф".
I want to store and display the character. I tried converting the binary digits to a hexadecimal character array and printing it with the following function:
#include <stdio.h>

#define MAX 1000 /* assumed buffer size; the post does not show the definition */

void binaryToHex(char *bData) {
    char hexaDecimal[MAX];
    int temp;
    long int i = 0, j = 0;

    /* Convert the ASCII digits '0'/'1' to the numbers 0/1 in place. */
    while (bData[i]) {
        bData[i] = bData[i] - 48;
        ++i;
    }
    --i;

    /* Walk backwards, turning each group of four bits into one hex digit. */
    while (i >= 3) {
        temp = bData[i - 3] * 8 + bData[i - 2] * 4 + bData[i - 1] * 2 + bData[i];
        if (temp > 9)
            hexaDecimal[j++] = temp + 55;
        else
            hexaDecimal[j++] = temp + 48;
        i = i - 4;
    }

    /* Handle one to three leftover leading bits. */
    if (i == 2)
        hexaDecimal[j] = bData[0] * 4 + bData[1] * 2 + bData[2] + 48;
    else if (i == 1)
        hexaDecimal[j] = bData[i - 1] * 2 + bData[i] + 48;
    else if (i == 0)
        hexaDecimal[j] = bData[i] + 48;
    else
        --j;

    printf("Equivalent hexadecimal value: ");
    char hexVal[MAX];
    int k = 0;

    /* Emit the hex digits in reverse order, prefixing each pair with the text "\x". */
    while (j >= 0) {
        char ch = hexaDecimal[j--];
        if (j % 2 == 0) {
            hexVal[k] = '\\';
            k++;
            hexVal[k] = 'x';
            k++;
        }
        printf("\nkk++Length %d ...J= %ld.. ", k, j);
        hexVal[k] = ch;
        k++;
        printf("%c", ch);
    }
    printf("KKKK+=== %d", k);
    hexVal[k] = '\0';
    printf("\nMM+-+MM %s===\n ..>>>>", hexVal);
}
This only prints the literal text \xD0\xA4, which I assembled with string manipulation. But when I write the string as a literal, either as
char s[] = "\xD0\xA4";
or as
char *s = "\xD0\xA4";
printf("\n %s", s);
it produces the desired result and prints the character "Ф". How can I build the correct string dynamically at run time? Is there a C library for this?
The code is from http://www.cquestions.com/2011/07/binary-to-hexadecimal-conversion-in.html.
Is there a way to print the character directly from the binary digits or from a hex value? Or is there an alternative?
Upvotes: 1
Views: 3864
Reputation: 395
In the end, I solved the problem for now by decoding the UTF-8 binary char array to the actual code point bits (for example, 11010000 10100100 becomes 10000 100100), converting that to decimal, and then encoding the decimal code point as UTF-8. Below is the link I used for the decimal-to-UTF-8 conversion:
C++ Windows decimal to UTF-8 Character Conversion
Resources I used:
https://www.youtube.com/watch?v=vLBtrd9Ar28
https://web.archive.org/web/20180216185523/http://www.zehnet.de/2005/02/12/unicode-utf-8-tutorial/
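The decode-then-encode round trip described above could be sketched roughly as follows. The function names utf8_decode and utf8_encode are hypothetical (they are not taken from the linked posts), and the decoder is simplified: it checks the byte patterns but does not reject overlong forms or surrogates.

```c
#include <stddef.h>

/*
 * Decode the first UTF-8 sequence in s (1 to 4 bytes) into a code point.
 * Returns the number of bytes consumed, or 0 if the sequence is malformed.
 * Simplified sketch: overlong forms and surrogates are not rejected.
 */
static size_t utf8_decode(const unsigned char *s, size_t len, unsigned long *cp) {
    size_t need, i;

    if (len == 0)
        return 0;
    if (s[0] < 0x80)                { *cp = s[0];        need = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; need = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; need = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; need = 4; }
    else
        return 0;                   /* stray continuation or invalid lead byte */
    if (len < need)
        return 0;
    for (i = 1; i < need; ++i) {
        if ((s[i] & 0xC0) != 0x80)  /* continuation bytes look like 10xxxxxx */
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return need;
}

/*
 * Encode cp as UTF-8 into out, which must hold at least 5 bytes.
 * The result is NUL-terminated. Returns the number of bytes written
 * (not counting the terminator), or 0 if cp is above U+10FFFF.
 */
static size_t utf8_encode(unsigned long cp, char *out) {
    size_t n;

    if (cp < 0x80) {
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return 0;
    }
    out[n] = '\0';
    return n;
}
```

For the bytes 0xD0 0xA4, utf8_decode yields the code point 1060 (U+0424), and utf8_encode turns 1060 back into the same two bytes. For valid input the re-encoded bytes are identical to the input bytes, so the round trip is mainly useful when the code point value itself is needed.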
Upvotes: 0
Reputation: 180398
Escape codes such as \xD0 are interpreted by the compiler when encountered in the value of a character or string literal. The compiler replaces them with the corresponding byte (or byte sequence in some cases). They are not meaningful to C at runtime.
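To make the difference concrete, here is a small sketch comparing a literal whose escapes the compiler translates with one that holds the escape text itself, which is what a string assembled at runtime contains. The names utf8_literal, escape_text, and byte_at are made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* The compiler turns each \xNN escape into one byte: 2 bytes plus the terminator. */
static const char utf8_literal[] = "\xD0\xA4";

/* Doubled backslashes keep the escape text itself: 8 ordinary characters,
 * the same content a runtime-built "\xD0\xA4" string has. */
static const char escape_text[] = "\\xD0\\xA4";

/* Return the i-th byte of s as an unsigned value, for inspection. */
static unsigned byte_at(const char *s, size_t i) {
    return (unsigned char)s[i];
}
```

Here strlen(utf8_literal) is 2 and its first byte is 0xD0, while strlen(escape_text) is 8 and its first byte is 0x5C, the backslash character.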
You are therefore not only making it harder on yourself but doing altogether the wrong thing by constructing and printing the text of such escape sequences at runtime. What you get is exactly what you should expect. Just print the literal byte sequence you decode from the program input, without any dress-up.
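A minimal sketch of that approach, assuming the input is a string of '0' and '1' characters with optional spaces between groups of eight. The function names bits_to_bytes and print_bits_as_utf8 and the 64-byte buffer are choices made for this example only.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Decode a string of '0'/'1' digits into raw bytes, 8 digits per byte.
 * Spaces between groups are skipped. Returns the number of bytes written,
 * or -1 on a bad character, an incomplete byte, or a too-small buffer.
 * The output is NUL-terminated so it can be printed with %s.
 */
static int bits_to_bytes(const char *bits, char *out, size_t out_size) {
    size_t n = 0;        /* bytes written so far */
    unsigned value = 0;  /* bits accumulated for the current byte */
    int count = 0;       /* number of bits accumulated */

    for (; *bits != '\0'; ++bits) {
        if (*bits == ' ')
            continue;
        if (*bits != '0' && *bits != '1')
            return -1;
        value = (value << 1) | (unsigned)(*bits - '0');
        if (++count == 8) {
            if (n + 1 >= out_size)
                return -1;           /* keep room for the terminator */
            out[n++] = (char)value;
            value = 0;
            count = 0;
        }
    }
    if (count != 0 || out_size == 0)
        return -1;
    out[n] = '\0';
    return (int)n;
}

/* Decode the digits and print the resulting bytes unchanged.
 * Returns 0 on success, -1 if the input could not be decoded. */
static int print_bits_as_utf8(const char *bits) {
    char buf[64];

    if (bits_to_bytes(bits, buf, sizeof buf) < 0)
        return -1;
    printf("%s\n", buf);
    return 0;
}
```

Calling print_bits_as_utf8("11010000 10100100") writes the bytes 0xD0 0xA4 followed by a newline. Whether that shows up as "Ф" depends on the terminal or output device interpreting the stream as UTF-8.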
Upvotes: 4