Reputation: 301
I have a question about how characters are stored in C char arrays.
I must read text from a file into an array of type "char" (I cannot use unsigned char). Characters with a value over 127 (e.g. €, ä, ö, ...) are stored as negative values, and they often take more than one array element (e.g. € takes 3 negative values).
How can I convert these negative values back into unsigned characters? Could someone link me to a tutorial or a guide about this issue?
Upvotes: 3
Views: 717
Reputation: 51039
This depends on the encoding you use.
Conventional 1-byte encodings cause no problems. Yes, some characters are treated as negative values, but they remain the same characters they were when read. If you write them back as is, they will be what they were.
Since you are sure you have 3 chars per euro symbol, you are dealing with some Unicode encoding, like UTF-8. This means that, if you want one array element per character, you should store them in a wider type like wchar_t. But this contradicts your requirement of storing the data in char.
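To see which bytes are behind the negative values, each stored char can be cast to unsigned char. The following is only a minimal sketch: byte_values is a hypothetical helper name, and the euro sign is written as explicit UTF-8 byte escapes so that the example does not depend on the encoding of the source file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies the bytes of a NUL-terminated char string
   into `out` as values 0..255 and returns how many were written
   (at most `cap`). */
size_t byte_values(const char *s, unsigned char *out, size_t cap) {
    assert(s != NULL && out != NULL);
    size_t n = 0;
    while (n < cap && s[n] != '\0') {
        /* Where char is signed, -30 becomes 226 (0xE2) here. */
        out[n] = (unsigned char) s[n];
        n++;
    }
    return n;
}
```

Called on "\xE2\x82\xAC" (the UTF-8 form of the euro sign), this yields the three bytes 0xE2, 0x82, 0xAC. The data in the char array is unchanged; only the interpretation of each element differs.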
I suggest you convert your file to a 1-byte encoding first, for example Windows-1252. This encoding uses 1 byte for the euro symbol.
If you wish to work with Unicode, I am afraid it is hard to deal with negative char values. It is traditional to represent Unicode values as positive integers.
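If the file does turn out to be UTF-8, one way to get those positive integers is to view the chars as unsigned bytes and decode each sequence into a code point. This is a rough sketch under that assumption, not a complete decoder: utf8_decode is a hypothetical helper, and it does not reject overlong forms or surrogate values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decodes one UTF-8 sequence starting at s into *cp.
   Returns the number of bytes consumed, or 0 if the lead byte or a
   continuation byte is malformed. */
size_t utf8_decode(const char *s, unsigned long *cp) {
    assert(s != NULL && cp != NULL);
    /* Read the chars as unsigned bytes so the bit masks behave. */
    const unsigned char *p = (const unsigned char *) s;
    size_t len;
    unsigned long v;

    if (p[0] < 0x80)                { len = 1; v = p[0]; }
    else if ((p[0] & 0xE0) == 0xC0) { len = 2; v = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; v = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; v = p[0] & 0x07; }
    else return 0;

    for (size_t i = 1; i < len; i++) {
        /* Continuation bytes look like 10xxxxxx; a NUL also fails here. */
        if ((p[i] & 0xC0) != 0x80) return 0;
        v = (v << 6) | (p[i] & 0x3F);
    }
    *cp = v;
    return len;
}
```

For the three bytes 0xE2 0x82 0xAC this returns 3 and sets the code point to 0x20AC (the euro sign). The return value tells the caller how far to advance in the char array before decoding the next character.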
Upvotes: 1
Reputation: 28837
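#include <stdio.h>
int main(void) {
    char x = (char) 128;                 /* typically -128 where char is signed and 8 bits wide */
    unsigned char y = (unsigned char) x; /* the cast recovers the byte value 128 */
    printf("%d %u\n", x, (unsigned) y);  /* shows the signed and the unsigned view */
}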
Upvotes: -1
Reputation: 9424
I think you should read this: http://www.joelonsoftware.com/articles/Unicode.html
Upvotes: 11